ABSTRACT


This  document  provides  guidance  for  investigating  and  conducting  root  cause  analyses  of  safety- 
related  incidents  associated  with  Air  Force  maintenance  actions.  The  impetus  for  and  sole  focus 
of  this  document  are  incidents  in  which  detectable  cracks  in  safety-of-flight  aircraft  structure  have 
been  missed  by  approved  nondestructive  inspection  (NDI)  techniques.  The  tools  defined  herein, 
however,  can  be  adapted  and  utilized  for  the  investigation  of  any  maintenance-related  incident. 

The  role  of  root  cause  analysis  in  incident  investigation  is  emphasized  in  this  document.  Root 
cause  analysis  is  intended  to  answer  the  questions  “What?,”  “How?,”  and,  most  importantly, 
“Why?” regarding an incident. A successful root cause analysis will identify the controllable
causal factors that can be corrected to eliminate the recurrence of similar incidents in the future.

This  document  provides  a  discussion  of  methodologies,  instructions,  and  worksheets  that  may  be 
used in a root cause analysis. Though there are numerous methods to use when conducting a root
cause analysis, this guide focuses on the application of the following four methods:

•  Sequential  Events  and  Causal  Factor  Analysis  (Appendix  A) 

•  Cause  and  Effect  Analysis  (Appendix  B) 

•  Change  Analysis  (Appendix  C) 

•  Human  Performance  Evaluation  (Appendix  D) 
1. SUMMARY


This  document  is  a  guide  for  conducting  investigations  of  failed  systems,  processes  or 
components,  specifically  those  related  to  incidents  in  which  otherwise  detectable  cracks  in  safety- 
of-flight  aircraft  structure  were  missed  during  a  nondestructive  inspection  (NDI)  process.  The 
basic  reason  for  investigating  and  reporting  the  causes  of  incidents  is  to  enable  the  identification 
of  corrective  actions  that  are  adequate  to  prevent  recurrence  and  thereby  protect  the  safety-of- 
flight  of  Air  Force  weapons  systems  as  well  as  to  ensure  the  safety  of  the  pilots,  passengers, 
maintainers  and  the  civilian  community. 

Root  cause  analysis  is  a  key  component  of  these  investigations.  During  a  root  cause 
analysis, numerous causal factors (i.e., “causes”) may be identified. Identification of causal factors
ensures  that  deficiencies  within  the  control  of  the  inspection  organization  and  processes  are  also 
identified.  This,  in  turn,  guides  early  corrective  actions  and  controls.  Therefore,  root  cause 
analysis  is  central  to  ensuring  robust  maintenance  practices  are  maintained. 

Root  cause  analyses  identify  the  “what,”  the  “how,”  and  the  “why”  associated  with 
incidents  that  are  investigated.  A  successful  root  cause  analysis  will  identify  the  controllable, 
causal  factors  that  can  be  corrected  to  eliminate  the  recurrence  of  similar  incidents  in  the  future. 

One  may  perform  a  root  cause  analysis  using  a  number  of  different  methods.  This  guide 
recommends and summarizes the methods that appear to be most applicable to analyzing
inspection-related incidents in the U.S. Air Force; however, the basic approach and root cause
analysis  techniques  described  in  this  document  can  apply  to  the  investigation  of  any  type  of 
failure  of  a  system,  process,  or  component. 

The  level  of  effort  expended  on  such  analyses  should  be  commensurate  with  the 
significance or severity of the incident. Most off-normal incidents will require only a scaled-down
effort, while most emergency incidents should be investigated using one or more of the
formal  analytical  models. 

The DOE Root Cause Analysis Guidance DOE-NE-STD-1004-92 was a significant
source  in  the  development  of  this  document.  Other  resources  were  also  utilized  and  are 
enumerated  in  the  reference  list. 
2. INVESTIGATION PHASES


(Methodology  and  Phases  to  Investigate  Undetected  Cracking  Incidents  in  Safety-of-Flight 
Aircraft  Structures) 

When investigating the root cause of inspection-related incidents, the following
methodology should be followed:

Step  1.  Identify  the  incident  as  a  Type  (I,  II,  or  III),  using  the  guidance  of  Table  1,  based 
on  the  circumstances  surrounding  the  incident. 

Step  2.  Identify  the  root-cause  analysis  methods  that  should  be  applied  by  using  the 
guidance  of  Table  1. 

Step 3. Form the investigation team (refer to Section 5) based on the resources and
expertise required to complete the investigation.

Step  4.  Conduct  the  five-phased  root  cause  investigation  (see  details  below).  Use  the 
attached  appendices  for  guidance  on  performing  the  various  recommended  root-cause  analysis 
methods. 

Keep  in  mind  that  regardless  of  the  method(s)  used  for  a  root  cause  analysis,  the  overall 
investigation  and  reporting  process  (Step  4)  should  include  the  five  phases  as  highlighted  below. 
While  there  may  be  some  overlap  between  phases,  every  effort  should  be  made  to  keep  them 
separate  and  distinct.  Management  involvement  and  adequate  allocation  of  resources  are 
essential  to  successfully  execute  the  five  investigation  and  reporting  phases. 

Phase I. Collecting. In order to minimize the loss of data and information, it is
important to begin the data collection phase of any investigation immediately following the
incident. The information that should be collected consists of: a) conditions before, during, and
after the incident, b) personnel involvement (including actions taken), c) environmental factors, d)
historical data, and e) other information relevant to the incident. (Details for conducting
Phase I can be found in Section 6.)

Phase II. Analyzing. This phase consists of the root cause analysis. Any root cause
analysis method may be used that includes the following steps (details for conducting Phase II
can be found in Section 7):

1.  Identify the problem.

2.  Determine  the  significance  of  the  problem. 

3.  Identify  the  causes  (conditions  or  actions)  immediately  preceding  and  surrounding 
the  problem. 

4.  Identify the reasons why the causes existed, working back to the root causes (the
fundamental reasons which, if corrected, will prevent recurrence of these and similar
incidents). Identification of the root causes concludes the analysis (or assessment)
phase.


Phase  III.  Correcting.  Implementing  effective  corrective  actions  for  each  root  cause 
reduces  the  probability  that  a  problem  will  recur,  improves  reliability,  and  enhances  safety. 
(Details  for  conducting  Phase  III  can  be  found  in  Section  8). 

Phase  IV.  Informing.  Management  and  personnel  involved  with  the  incident  should  be 
informed  of  the  results  of  any  investigation.  Key  information  that  should  be  reported  includes 
findings,  the  results  of  the  root  cause  analysis,  and  all  corrective  actions  recommended  and/or 
implemented. Consideration should also be given to distributing the results of the investigation
to other pertinent organizations. (Details for conducting Phase IV can be found in Section 9.)

Phase  V.  Verifying.  This  phase  includes  determining  if  corrective  action  has  been 
effective  in  resolving  problems.  An  effectiveness  review  is  essential  to  ensure  that  corrective 
actions have been implemented and are preventing recurrence. Management involvement and
adequate allocation of resources are essential to successful execution of the five investigation
and reporting phases. (Details for conducting Phase V can be found in Section 10.)
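As an illustration only, the five phases and the sections that detail them could be tracked with a short Python sketch such as the one below. The names Phase, DETAIL_SECTION, and next_phase are hypothetical and are not part of this guide; the sketch simply assumes that phases are completed in the order listed.

```python
from enum import IntEnum
from typing import Optional, Set


class Phase(IntEnum):
    """The five investigation and reporting phases, in order."""
    COLLECTING = 1
    ANALYZING = 2
    CORRECTING = 3
    INFORMING = 4
    VERIFYING = 5


# Section of this guide that gives the details for each phase.
DETAIL_SECTION = {
    Phase.COLLECTING: 6,
    Phase.ANALYZING: 7,
    Phase.CORRECTING: 8,
    Phase.INFORMING: 9,
    Phase.VERIFYING: 10,
}


def next_phase(completed: Set[Phase]) -> Optional[Phase]:
    """Return the earliest phase not yet completed, or None when all are done."""
    for phase in Phase:
        if phase not in completed:
            return phase
    return None
```

For example, next_phase({Phase.COLLECTING}) returns Phase.ANALYZING, and DETAIL_SECTION[Phase.ANALYZING] points the team to Section 7.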
3. DEFINITIONS


Causal  Factor  (Cause)  -  A  condition  or  an  event  that  results  in  a  failure  or  incident 
(anything  that  shapes  or  influences  the  outcome).  For  example:  a)  poor  signal-to-noise  ratios  in 
inspection  instruments,  b)  failures  to  follow  defined  procedures,  c)  failure  to  validate  procedures, 
d)  weaknesses  or  deficiencies  in  management  or  administration. 

Causal Factor Chain (Sequence of Events and Causal Factors) - A cause and effect
sequence in which a specific action creates a condition that contributes to, or results in, an event.
Each event may create new conditions that, in turn, contribute to or result in another event.
The sequence of events ultimately leads to the incident under investigation.

Condition  -  Any  as-found  state,  whether  or  not  resulting  from  an  event,  that  may  have 
adverse  safety,  operational,  readiness,  or  mission  capability  implications.  An  (existing)  error  in 
assumed  detectable  flaw  size,  an  anomaly  associated  with  (resulting  from)  design  or  performance, 
poorly  written  procedures  or  an  item  indicating  a  weakness  in  the  management  process  are  four 
example  conditions. 

Contributing  Cause  -  A  cause  that  contributed  to  an  incident  but,  by  itself,  would  not 
have caused the incident. For example, in the case of the failure of an inspection to detect a large
crack in safety-of-flight structure, a contributing cause could be inspector fatigue reducing
the inspector’s focus on the task. This condition, by itself, is not a direct or root causal factor.

Crack  Lengths  -  Various  crack  lengths  are  of  interest  when  performing  a  root  cause 
analysis  of  cracks  that  are  undetected  by  nondestructive  inspection  techniques.  Many  of  these 
crack  lengths  are  graphically  defined  in  Figure  1.  These  crack  lengths  include: 

$a_0$, $a_i$, or $a_{init}$ - the length of the starting, inherent, or initial crack existing within a
structure at the time of manufacture

$a_{NDI}$ - the length of the largest crack that an NDI method can miss based upon probability
of detection (POD) studies, usually corresponding to a “90/95” crack length (i.e., a crack
length that can be found 90% of the time with 95% confidence). Also known as the “minimum
detectable crack size.”

$a_{miss}$ - the length of a crack missed (undetected) by a nondestructive inspection

$a_{critical}$ or $a_{cr}$ - the critical length of a crack above which the crack will grow
catastrophically to failure upon the application of the next design limit load cycle

$a_{cr-miss}$ - the length of a crack that, if missed (undetected) by a nondestructive inspection,
will grow catastrophically to failure (i.e., to $a_{critical}$) before the next inspection
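One common way to state the “90/95” criterion formally is sketched below. The symbols POD(a) and POD_{L,95}(a) are introduced here for illustration only and are not defined elsewhere in this guide; the expression also assumes that the lower confidence bound increases with crack length.

```latex
% POD(a): probability that the inspection detects a crack of length a.
% POD_{L,95}(a): lower 95% confidence bound on POD(a), estimated from a POD study.
a_{NDI} = \min \left\{ a : \mathrm{POD}_{L,95}(a) \geq 0.90 \right\}
```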

Direct Cause - The cause that directly resulted in the incident. For example, suppose that during
the laboratory development of an inspection, the technique was found to be capable of detecting a
crack of a given size; however, after the procedure was fielded, cracks of the same size were missed
during subsequent inspections. In this case, a direct cause could be a failure to issue clear and
accurate inspection procedures.

Event  -  An  event  can  be  thought  of  as  a  “building  block”  of  an  incident.  A  sequence  or 
chain  of  events  leads  up  to  the  incident  being  investigated. 
Facility - Any organization or location that fulfills a specific purpose. Examples include
Air Logistics Centers (ALCs), field-level inspection laboratories, flight lines, maintenance
facilities, firms conducting inspections under contract, and any location or organization where
inspection or maintenance actions occur.

Incident  (Failure  Event)  -  The  failure  of  a  system,  process,  or  component  that  has  the 
potential  to  seriously  impact  safety  and/or  mission  capability.  An  incident  is  usually  comprised 
of  a  series  of  events  brought  about  by  the  interaction  of  a  number  of  causes.  For  the  purposes  of 
this  guide,  incidents  are  failures  of  nondestructive  inspection  processes  to  detect  cracks  in  safety- 
of-flight  aircraft  structures.  Such  failures  (incidents)  may  be  classified  into  three  general 
categories  described  below  in  Figure  1  and  in  Table  1: 

Type  I  Incident  -  A  Type  I  incident  is  any  incident  that  resulted  in  or  could  result  in  a 
Class  A  mishap  or  that  poses  a  high  risk  to  flight  safety  or  mission  capability.  The  consequences 
of  a  Type  I  incident  are  a  loss  of  life  or  aircraft  or  effects  which  would  result  in  a  Class  A  mishap. 

An undetected crack incident is classified as Type I if a crack with length $a_{miss}$ is
undetected and if $a_{miss} \geq a_{cr-miss}$. In this case, the undetected crack with length $a_{miss}$ can grow to a
length of $a_{critical}$ in a period that is less than one inspection interval (i.e., it can grow to a length of
$a_{critical}$ before the next inspection).

Type  I  incidents  require  the  use  and  documentation  of  a  formal  root  cause  analysis 
method  to  identify  the  causal  factors  and  program  deficiencies.  The  Sequential  Events  and 
Causal  Factor  Analysis,  Change  Analysis,  and  Cause  &  Effect  Analysis  methods  should  all  be 
used  together  in  an  extensive  investigation  of  the  causal  factor  chain.  Human  Performance 
Evaluation  may  also  be  required  for  Type  I  incidents  if  warranted. 

Type  II  Incident  -  A  Type  II  incident  is  any  incident  that  poses  a  moderate  risk  to  flight 
safety  and/or  mission  capability.  The  consequences  of  a  Type  II  incident  are  major  readiness  or 
economic  impacts. 

An undetected crack incident is classified as Type II if a crack with length $a_{miss}$ is
undetected and if $a_{NDI} < a_{miss} < a_{cr-miss}$. In this case, the undetected crack with length $a_{miss}$ cannot
grow to a length of $a_{critical}$ in a period that is less than one inspection interval (i.e., it cannot grow
to a length of $a_{critical}$ before the next inspection).

The  investigation  of  a  Type  II  incident  requires,  at  a  minimum,  the  use  of  the  Sequential 
Events  and  Causal  Factor  Analysis  and  Cause  &  Effect  Analysis  methods.  The  Change  Analysis 
and  Human  Performance  Evaluation  methods  may  also  be  required  for  Type  II  incidents  if 
warranted. 

Type  III  Incident  -  A  Type  III  incident  is  any  incident  that  poses  a  low  risk  to  flight 
safety  or  mission  capability  but  that  may  be  an  indication  of  wider  program  issues  that  require 
attention.  Type  III  incidents  may  not  have  any  significant  consequences  beyond  the  need  to  apply 
standard  repairs  (find  and  fix). 

An undetected crack incident is classified as Type III if a crack with length $a_{miss}$ is
undetected and if $a_{miss} < a_{NDI}$. In this case, the undetected crack with length $a_{miss}$ cannot grow to a
length of $a_{critical}$ in a period that is less than two inspection intervals (i.e., it cannot grow to a length
of $a_{critical}$ before two inspections have been performed). It is expected that the next
inspection will detect these cracks before they grow to length $a_{critical}$.

Type  III  incidents  require,  at  a  minimum,  gathering  information  and  drawing  conclusions 
without  requiring  the  use  of  any  formal  analytical  method.  However,  in  most  cases,  an 
understanding  of  the  root  cause  analytical  methods  is  important  to  conducting  an  adequate  inquiry 
and  drawing  correct  conclusions. 
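The three crack-length criteria above can be sketched as a small classification function. This is a minimal sketch, not an approved tool: the function and parameter names are hypothetical, all lengths are assumed to be in the same unit, and the handling of the boundary case in which the missed crack length equals the minimum detectable crack size is an assumption (it is treated conservatively as Type II).

```python
def classify_incident(a_miss: float, a_ndi: float, a_cr_miss: float) -> str:
    """Return "I", "II", or "III" for a missed crack of length a_miss.

    a_ndi is the minimum detectable crack size and a_cr_miss is the length
    of a crack that, if missed, reaches the critical length before the next
    inspection. All three lengths must use the same unit.
    """
    if min(a_miss, a_ndi, a_cr_miss) <= 0:
        raise ValueError("crack lengths must be positive")
    if a_miss >= a_cr_miss:
        return "I"    # can reach the critical length before the next inspection
    if a_miss >= a_ndi:
        # Assumption: the boundary case a_miss == a_ndi is treated as Type II.
        return "II"
    return "III"      # smaller than the minimum detectable crack size
```

With hypothetical values of 0.05 for the minimum detectable crack size and 0.25 for the critical missed length, a missed crack of length 0.10 would be classified as Type II.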
Table 1. Incident Types, Potential Consequences, and Recommended Root-Cause
Analysis Methods for Undetected Cracking Incidents in Safety-of-Flight Structures

TYPE I                                 | TYPE II                                | TYPE III
---------------------------------------+----------------------------------------+---------------------------------------
Consequence:                           | Consequence:                           | Consequence:
Loss of or Significant Damage to       | Major Readiness or Economic Impact     | Standard Repair Required
Aircraft                               |                                        |
Class A Mishap                         |                                        |
---------------------------------------+----------------------------------------+---------------------------------------
Criteria:                              | Criteria:                              | Criteria:
$a_{miss} \geq a_{cr-miss}$            | $a_{NDI} < a_{miss} < a_{cr-miss}$     | $a_{miss} < a_{NDI}$
The undetected crack can grow to       | The undetected crack cannot grow to    | The undetected crack cannot grow to
$a_{critical}$ in a period that is     | $a_{critical}$ in a period that is     | $a_{critical}$ in a period that is
less than one inspection interval      | less than one inspection interval      | less than two inspection intervals
(i.e., it can grow to $a_{critical}$   | (i.e., it cannot grow to               | (i.e., it cannot grow to
before the next inspection)            | $a_{critical}$ before the next         | $a_{critical}$ before two inspections
                                       | inspection)                            | have been performed)
---------------------------------------+----------------------------------------+---------------------------------------
Root Cause Analysis Methods:           | Root Cause Analysis Methods:           | Root Cause Analysis Methods:
Sequential Events and Causal Factor    | Sequential Events and Causal Factor    | Informal data gathering, analysis
Analysis                               | Analysis                               | and reporting
and                                    | and                                    | with
Cause and Effect Analysis              | Cause and Effect Analysis              | Human Performance Evaluation
and                                    | and                                    | (if warranted)
Change Analysis                        | Human Performance Evaluation           |
and                                    | with                                   |
Human Performance Evaluation           | Change Analysis (if warranted)         |
Figure 1. Establishing the Type of an Undetected Crack Incident


Incident  Report  -  An  incident  report  is  a  written  evaluation  of  an  incident  that  is 
prepared  in  sufficient  detail  to  enable  the  reader  to  assess  its  significance,  consequences,  or 
implications  and  to  evaluate  actions  being  employed  to  correct  the  condition  or  to  avoid 
recurrence. 

Reportable  Incident  -  An  event  or  condition  that  meets  the  factors  of  a  Type  I  or  Type  II 
incident. 

Root  Cause  -  The  cause  that,  if  corrected,  would  prevent  recurrence  of  the  incident  under 
investigation  and  similar  incidents.  The  root  cause  does  not  apply  only  to  the  incident  being 
investigated,  but  has  generic  implications  to  a  broad  group  of  possible  incidents.  It  is  the  most 
fundamental  cause  that  can  logically  be  identified  and  corrected.  When  a  series  of  related  or 
sequential  causes  can  be  identified,  the  series  should  be  pursued  until  the  fundamental, 
correctable (i.e., “root”) cause has been identified. It should be stressed that a root cause is a cause
that is correctable or controllable. For example, adverse weather is not a root cause. However, a
policy that governs how weather should affect NDI activities could be a root cause since it is
controllable and correctable.

For example, the root cause of an undetected crack in a safety-of-flight structure could be
that a lack of sufficient funds or schedule caused the engineering process to release inspection
procedures to the field before sufficiently validating and verifying the procedures. This could
have led to an incorrect application of an inspection technique by the inspector, which prevented
complete coverage of the inspection area and which, ultimately, led to the failure to detect an
otherwise detectable defect.

For a second root cause example of the undetected crack in a safety-of-flight structure,
suppose that the inspection was performed outside and that weather was found to be a factor in
the lack of detection. A root cause could be that the leadership chain chose to expedite all
maintenance procedures, regardless of their relation to safety, by requiring them to be performed
without taking time to hangar the aircraft. A heavy wartime OPTEMPO and poor weather were
contributing causes, but only the leadership decision was truly controllable and correctable.
Thus, the root cause was the leadership decision and was neither the weather conditions nor the
pace of operations.
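The rule that a root cause must be controllable or correctable can be illustrated with a short sketch. It is a simplification under stated assumptions: the Cause class and find_root_cause function are hypothetical names, and the causes are modeled as a single linear chain, whereas a real investigation usually produces a branching chart.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Cause:
    description: str
    controllable: bool                    # can the organization correct it?
    because_of: Optional["Cause"] = None  # the deeper cause behind this one


def find_root_cause(direct: Cause) -> Optional[Cause]:
    """Walk back from the direct cause; return the deepest controllable cause."""
    root = None
    node = direct
    while node is not None:
        if node.controllable:
            root = node
        node = node.because_of
    return root


# The weather example, written as a chain (the wording is illustrative).
optempo = Cause("Heavy wartime OPTEMPO", controllable=False)
decision = Cause(
    "Leadership required maintenance without hangaring the aircraft",
    controllable=True,
    because_of=optempo,
)
weather = Cause(
    "Poor weather degraded the outdoor inspection",
    controllable=False,
    because_of=decision,
)

root = find_root_cause(weather)
print(root.description)
```

In this chain the leadership decision is selected because it is the only cause marked controllable; the weather and the pace of operations are skipped.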
4. OVERVIEW OF INCIDENT INVESTIGATIONS


The objective of investigating and reporting the cause of incidents is to enable the
identification of corrective actions that are adequate to prevent recurrence and thereby protect
the safety of the aircraft structure, the crew, and the military and civilian population (e.g.,
from a plane crashing into a school yard). Programs can then be improved and managed more efficiently
and safely.

The  investigation  process  is  used  to  gain  an  understanding  of  the  incident,  its  causes,  and 
what  corrective  actions  are  necessary  to  prevent  recurrence.  The  line  of  reasoning  in  the 
investigation  process  is: 

•  Outline  what  happened  step-by-step. 

•  Begin  with  the  incident  by  identifying  the  problem  (condition,  situation,  or  action 
that  was  not  wanted  and  not  planned). 

•  Determine what program element was supposed to have prevented this
incident (was it lacking or did it fail?).

•  Investigate  the  reasons  why  this  situation  was  permitted  to  exist. 

This  line  of  reasoning  will  explain  why  the  incident  was  not  prevented  and  what 
corrective  actions  will  be  most  effective.  This  reasoning  should  be  kept  in  mind  during  the  entire 
root  cause  analysis  process. 
5. THE INVESTIGATION TEAM

Before  an  investigation  can  proceed,  the  investigating  team  must  be  formed.  This  team 
should  only  encompass  the  minimum  number  of  subject  matter  experts  required  to  effectively  and 
efficiently  accomplish  the  investigation.  Care  must  be  taken  in  selection  of  the  team  to  ensure 
complete objectivity. For situations where the incident occurred within the depot, external
expertise should be utilized to ensure investigation impartiality. For investigation of Type I and
II incidents on U.S. Air Force safety-of-flight aircraft structures, the following team composition
is recommended.

Airframe ASIP Manager - Lead
Depot  NDI  Manager  -  Co-Lead 
Depot  NDI  Production  Representative 
Depot  Engineering  Representative 
Quality  Representative 
Union  Representative 
Airframe  NDI  technician 
Additional  Subject  Matter  Experts: 

Human  Factors  Analyst 
Root-Cause  Analysis  Facilitator 

Additional  external  structural  and  NDI  experts  as  required  to  ensure 
investigation  impartiality 
6. PHASE I - COLLECTING


It  is  important  to  begin  the  Collecting  phase  of  the  root  cause  process  immediately 
following  the  identification  of  an  incident  to  minimize  the  loss  of  data  and  information. 

The  primary  objective  of  the  Collecting  phase  is  to  provide  information  that  can  be  used 
to  determine  the  direct,  contributing  and  root  causes  so  that  effective  corrective  actions  can  be 
taken  that  will  prevent  recurrence.  The  information  that  should  be  collected  consists  of 
conditions  before,  during,  and  after  the  incident;  personnel  involved;  actions  taken; 
environmental  factors;  and  any  other  relevant  information.  Considerations  when  determining 
what  information  is  needed  include: 

•  Activities  related  to  the  incident. 

•  Initial  or  recurring  problems  with  procedures,  equipment,  personnel,  software,  etc., 
associated  with  the  incident. 

•  Recent  administrative  program,  procedure  and/or  equipment  changes. 

•  Physical  environment  or  circumstances  such  as  facility  temperature,  humidity,  lighting, 
noise  levels,  etc. 

•  Human  factors  variables  such  as  inspector  fatigue  level,  attitude,  health,  etc. 

Some  methods  of  gathering  information  include: 

•  Conducting  interviews/collecting  statements  -  Interviews  must  be  fact-finding  and  not 
fault  finding.  Preparing  questions  before  the  interview  is  essential  to  ensure  that  all 
necessary  information  is  obtained.  The  causal  factor  work  sheets  in  Appendix  F  can  be 
used  as  a  tool  to  help  gather  information. 

Interviews  should  be  conducted,  preferably  in  person,  with  those  people  who  are  most 
familiar  with  the  problem.  Individual  statements  could  be  obtained  if  time  or  the  number 
of  personnel  involved  makes  interviewing  impractical.  Interviews  can  be  documented 
using  any  format  desired  by  the  interviewer. 

Although  preparing  for  the  interview  is  important,  it  should  not  delay  prompt  contact 
with  participants  and  witnesses.  The  first  interview  may  consist  solely  of  hearing  their 
narrative.  A  second,  more-detailed  interview  can  be  arranged,  if  needed.  The 
interviewer  should  always  consider  the  interviewee’s  objectivity  and  frame  of  reference. 
Also,  consider  interviewing  other  personnel  who  have  performed  the  job  in  the  past. 

•  Conducting  a  "walk-through"  recreation  of  events  leading  up  to  the  incident  -  It  may  be 
useful  to  perform  this  in  conjunction  with  interviews  to  help  identify  the  series  of  events 
and  roles  of  individuals  before,  during,  and  after  the  incident. 

•  Collecting  physical  evidence  -  Every  effort  should  be  made  to  preserve  physical  evidence 
such  as  failed  components,  inspection  records,  work  orders  and  procedures.  This  should 
be  done  despite  operational  pressures  to  restore  equipment  to  service.  Establishing  a 
quarantine  area,  or  tagging  and  segregation  of  pieces  and  material,  should  be  performed 
for failed equipment or components. Physical evidence includes photographic
documentation  of  the  incident  area  from  several  views. 

•  Reviewing  records  -  Review  relevant  documents  or  portions  of  documents  as  necessary 
and  reference  their  potential  use  in  support  of  the  root  cause  analysis.  Record  appropriate 
dates  and  times  associated  with  the  incident  on  the  documents  reviewed.  Examples  of 
documents  include  the  following: 

Operating  logs 
Training  records 
Correspondence 
Inspection/surveillance  records 
Maintenance  records 
Meeting  minutes 
Procedures  and  instructions 
Work  cards 
Vendor  Manuals 
Drawings and specifications
Equipment history records (repair, calibration, etc.)
Failure analysis reports
Design  basis  information 
Related  quality  control  evaluation  reports 
Work  orders 

•  Acquiring  related  information  -  Some  additional  information  that  an  evaluator  should 
consider  when  analyzing  the  causes  includes  the  following: 

o  Laboratory  tests,  such  as  destructive/nondestructive  failure  analysis. 

o  The physical layout of the system, component, and work area, including layout sketches
and/or photographs.

o  Photographs  of  inspection  access,  physical  position  of  inspector,  instrument/display 
location,  and  probe  placement  during  inspection. 

o  Historical  information  of  previous  similar  incidents  if  they  have  occurred  at  the  same 
facility  or  facilities  with  similar  inspection  requirements. 

o  Instrument, reference standard, or sensor records, including sensor performance and
reference standard certifications, calibration records, or correspondence that addressed
system performance issues.
7. PHASE II - ANALYZING


7.1.  Types  of  Root  Cause  Analysis  Methods 

Numerous analytical methods exist for conducting root cause analysis. However, for the
purposes  of  this  guide  the  following  methods  are  recommended.  Use  the  guidance  of  Table  1  to 
help  select  the  analysis  methods  to  use  for  a  given  incident  type. 

•  Sequential  Events  and  Causal  Factor  Analysis  (Appendix  A) 

•  Cause  and  Effect  Analysis  (Appendix  B) 

•  Change  Analysis  (Appendix  C) 

•  Human  Performance  Evaluation  (Appendix  D) 

The  extent  to  which  these  methods  are  used  and  the  level  of  analytical  effort  spent  on  root 
cause analysis should be commensurate with the significance of the incident. A high level of
effort should be spent on Type I incidents related to safety-of-flight; an intermediate level should
be spent on most unusual incidents (Type II); and a relatively low level of effort should be adequate
for most off-normal incidents (Type III). In any case, the depth of analysis should be adequate to
explain  why  the  incident  happened,  determine  how  to  prevent  recurrence,  and  assign 
responsibility  for  corrective  actions.  An  inordinate  amount  of  effort  to  pursue  the  causal  path  is 
not  expected  if  the  significance  of  the  incident  is  minor. 
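The method requirements stated in the Type I, II, and III definitions of Section 3, together with the effort levels described above, can be sketched as a simple lookup. The names MethodPlan, PLANS, and plan_for are hypothetical, and where Table 1 and the text could be read differently this sketch follows the text of Section 3.

```python
from dataclasses import dataclass
from typing import Tuple

SEQ = "Sequential Events and Causal Factor Analysis"
CE = "Cause and Effect Analysis"
CHANGE = "Change Analysis"
HPE = "Human Performance Evaluation"


@dataclass(frozen=True)
class MethodPlan:
    required: Tuple[str, ...]      # formal methods that should always be used
    if_warranted: Tuple[str, ...]  # methods added when circumstances warrant
    effort: str                    # relative level of analytical effort


PLANS = {
    "I": MethodPlan((SEQ, CE, CHANGE), (HPE,), "high"),
    "II": MethodPlan((SEQ, CE), (CHANGE, HPE), "intermediate"),
    # Type III needs only informal data gathering, analysis and reporting.
    "III": MethodPlan((), (HPE,), "low"),
}


def plan_for(incident_type: str) -> MethodPlan:
    """Return the method plan for incident type "I", "II", or "III"."""
    return PLANS[incident_type]
```

For example, plan_for("II").required lists the two methods that a Type II investigation uses at a minimum.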

7.1.1.  Sequential  Events  and  Causal  Factor  Analysis 

The Sequential Events and Causal Factor Analysis method is used to analyze multifaceted
problems or long, complex causal factor chains. This analysis method results in a chart
that describes the time sequence of a series of events and/or actions and their surrounding
conditions. The time sequence of actions or happenings is known as a “causal factor chain” or
“chain of events.” “Conditions” are things that shape the event outcomes; they range from
physical conditions (such as surfaces that are improperly prepared for inspections) to inspector
attitudes and organizational safety culture. This method should be used to investigate all Type I
and Type II incidents. Furthermore, this method must be used in conjunction with Cause and
Effect Analysis to identify all possible causes (direct, root and contributing). Appendix A
provides details for performing Sequential Events and Causal Factor Analysis.
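As a rough illustration of the chart this method produces, the sketch below orders events in time and lists the conditions surrounding each one. It is not the Appendix A procedure; the Event class and build_chart function are hypothetical names, and the output is plain text rather than a drawn chart.

```python
from dataclasses import dataclass, field
from datetime import datetime
from typing import List


@dataclass
class Event:
    when: datetime
    action: str
    conditions: List[str] = field(default_factory=list)


def build_chart(events: List[Event]) -> List[str]:
    """Return text lines: events in time order, each followed by its conditions."""
    lines = []
    for event in sorted(events, key=lambda e: e.when):
        lines.append(f"{event.when:%Y-%m-%d %H:%M}  EVENT: {event.action}")
        lines.extend(f"    condition: {c}" for c in event.conditions)
    return lines
```

Events may be entered in any order as they are uncovered during interviews and record reviews; build_chart sorts them so that the causal factor chain reads from the earliest event to the incident.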

7.1.2. Cause and Effect Analysis

The  Cause  and  Effect  Analysis  method  generates  and  sorts  hypotheses  about  possible 
causes  of  a  problem  within  a  process  by  asking  participants  to  list  all  of  the  possible  causes  and 
effects  for  the  identified  problem.  This  tool  organizes  a  large  amount  of  information  by  showing 
the  links  between  the  events  and  their  potential  or  actual  causes  and  provides  a  means  of 
generating  ideas  (brainstorming)  about  why  the  problem  is  occurring  and  possible  effects  of  that 
cause.  The  Cause  and  Effect  Analysis  method  allows  problem  solvers  to  broaden  their  thinking 
and  look  at  the  overall  picture  of  a  problem.  The  resulting  Cause  and  Effect  Diagram  can  reflect 
either  a)  causes  that  block  the  way  to  the  desired  state  or  b)  processes  needed  to  reach  the  desired 
state.  This  method  should  be  used  to  investigate  all  Type  I  and  Type  II  incidents.  Appendix  B 
describes  this  method. 

7.1.3.  Change  Analysis


Change  Analysis  investigates  a  problem  by  analyzing  the  deviation  between  what  events 
are  expected  and  what  events  actually  occurred.  The  evaluator  asks  what  occurred  to  make  a  task 
or  activity  result  in  a  failure  or  problem  when,  previously,  the  task  or  activity  was  successfully 
completed.  The  Change  Analysis  method  is  primarily  used  to  investigate  complex  problems  or 
when  previous  similar  actions  or  processes  resulted  in  a  positive  outcome. 

This method should be used in conjunction with the Sequential Events and Causal Factor
Analysis method and the Cause and Effect Analysis method for the investigation of Type I
incidents. It may also be used to augment investigations of Type II incidents. Appendix C
describes this method.
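
The comparison at the heart of Change Analysis can be sketched in a few lines of Python. This is only an illustration: the function name `find_changes` and the step descriptions are hypothetical and are not taken from Appendix C. The sketch lists the steps that were present when the task was previously completed successfully but absent in the incident, and any steps that appeared only in the incident.

```python
def find_changes(baseline_steps, incident_steps):
    """Compare a previously successful task with the incident performance.

    Both arguments are ordered lists of short step descriptions.
    Returns (omitted, added), each in its original order.
    Hypothetical helper for illustration only.
    """
    baseline = set(baseline_steps)
    incident = set(incident_steps)
    omitted = [step for step in baseline_steps if step not in incident]
    added = [step for step in incident_steps if step not in baseline]
    return omitted, added


# Hypothetical step lists, invented for this sketch.
baseline = [
    "move aircraft into hangar",
    "calibrate inspection equipment",
    "inspect part",
]
incident = [
    "calibrate inspection equipment",
    "inspect part",
]
omitted, added = find_changes(baseline, incident)
```

Each omitted or added step is a candidate change for the evaluator to question; the comparison itself does not establish that the change caused the incident.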

7.1.4.  Human  Performance  Evaluation 

The  Human  Performance  Evaluation  method  is  used  to  identify  factors  that  influence  task 
performance.  It  is  most  frequently  used  for  man-machine  interface  studies.  Its  focus  is  on 
examining  operability  and  work  environment,  rather  than  on  training  operators  to  compensate  for 
bad conditions. The Human Performance Evaluation method is extremely versatile and may be
used to analyze most incidents, since many of the conditions and situations leading to an incident
ultimately result from some type of task performance problem. These problems may involve an analysis of
planning,  scheduling,  task  assignment,  ergonomics/accessibility  and  instrumentation  interfaces. 
Formal  training  in  ergonomics  and  human  factors  is  needed  to  perform  adequate  human 
performance  evaluations,  especially  in  man-machine  interface  situations.  This  method  can  be 
used  to  investigate  any  type  of  incident  if  warranted.  Appendix  D  discusses  this  method. 

7.2  Applying  Root  Cause  Analysis  Methods 

Use the analytical methods described above by following the three primary steps
highlighted below. These steps are based upon the use of the most common analytical methods,
the Sequential Events and Causal Factor Analysis method and the Cause and Effect Analysis
method. Note, however, that if an initial evaluation of the incident indicates that the problem is
very complex, that process changes have played a significant role, or that issues with personnel
or human factors are key contributing factors, then additional analysis methods should be brought
to bear.

Step 1. Determine the Causal Factor Chain (the Sequence of Events and Causal Factors).

(a)  The  first  step  is  to  determine  the  sequence  of  events  and/or  actions  that  led  up  to  the 
incident  as  well  as  their  surrounding  conditions.  An  effective  way  to  document  this  is  by 
using  a  sequential  Events  and  Causal  Factor  Chart  described  in  Appendix  A.  This  type  of 
chart  provides  a  structure  for  investigators  to  organize  and  analyze  the  information 
gathered  during  the  investigation  and  to  identify  gaps  and  deficiencies  in  knowledge  as 
the  investigation  progresses.  Preparation  of  an  Events  and  Causal  Factor  Chart  should 
begin  as  soon  as  the  investigation  starts.  The  chart  is  modified  and  updated  as  the 
relevant  facts  are  uncovered.  This  chart  should  drive  the  data  collection  process  by 
identifying  data  needs. 

(b)  Begin  developing  the  Events  and  Causal  Factor  Chart  by  “working  backwards”  from  the 
incident  and  identifying  the  events/actions  and  their  associated  conditions  that 
immediately  preceded  the  incident. 

(c)  Continue  to  work  backwards  and  identify  successively  earlier  events/actions  and  their
associated  conditions.  Eventually,  when  this  process  is  complete,  you  will  be  able  to 
identify  the  root  cause  (the  fundamental  reason  that,  if  corrected,  will  prevent  recurrence 
of  this  and  similar  incidents). 
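
The "working backwards" procedure in items (b) and (c) can be sketched as a simple traversal. This is a minimal illustration, assuming each event has a single recorded predecessor; the function name `trace_back` and the event wording (loosely based on the example incident in Appendix A) are assumptions made for this sketch.

```python
def trace_back(incident, preceded_by):
    """Walk backwards from the incident, returning events latest first.

    `preceded_by` maps each event to the event that immediately
    preceded it. Hypothetical helper for illustration only.
    """
    chain = [incident]
    seen = {incident}
    current = incident
    while current in preceded_by:
        current = preceded_by[current]
        if current in seen:
            # Guard against an accidental loop in the recorded data.
            raise ValueError(f"cycle detected at {current!r}")
        seen.add(current)
        chain.append(current)
    return chain


# Illustrative wording only; a real chart also records conditions.
preceded_by = {
    "engine failed in flight": "bracket ingested by engine",
    "bracket ingested by engine": "flap track failed",
    "flap track failed": "crack missed during inspection",
}
chain = trace_back("engine failed in flight", preceded_by)
```

The last entry in the chain is only the earliest event recorded so far. It marks where the investigator should keep asking what came before, and is not by itself the root cause.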

Step  2.  Analyze  the  Causal  Factor  Chain 

The  next  step  is  to  analyze  the  Causal  Factor  Chain.  (If  the  Change  Analysis  and/or 
Human  Performance  Evaluation  methods  were  used,  these  results  may  be  useful  in  this 
step).  For  purposes  of  this  guide,  it  is  recommended  that  the  Causal  Factor  Chain  be 
analyzed  using  a  Cause  and  Effects  Chart  otherwise  known  as  a  Fishbone  or  Tree 
Diagram.  This  charting  method  generates  and  sorts  hypotheses  about  possible  causes  of 
an  incident  by  asking  participants  to  list  all  of  the  causes  and  effects  related  to  the 
identified  problem.  Causes  and  effects  identified  by  other  analysis  methods,  including 
Sequence  of  Events  and  Causal  Factor  Analysis,  Change  Analysis  and  Human 
Performance Evaluation, must be incorporated in this step. The ultimate result is a
graphical display, organized by major cause categories, of what is known and unknown,
which helps identify areas requiring additional investigation. Since all conditions are a
result  of  prior  actions,  the  diagram  identifies  what  remaining  questions  to  ask  to  follow 
the  path  to  the  source  or  root  cause.  Appendix  B  provides  guidance  on  generating  a 
Cause  and  Effects  Chart. 


Step 3. Summarize Findings, List Causal Factors

(a) Once the causal factors have been sequenced using a Causal Factor Chain Chart and
analyzed using a Cause and Effects Chart, they can be summarized and categorized.

Causal factors should be categorized in two ways: in terms of their source, and in terms of
their hierarchy with respect to the incident being investigated.

(b) Categorizing causal factors in terms of their source identifies general areas of a system,
process, or component that will require focused attention during the development of
corrective actions. “Cause Codes,” such as those shown in Appendix E, should be used to
categorize causal factors in terms of their source(s).

(c)  Causal  factors  may  also  be  categorized  in  terms  of  their  hierarchy  with  respect  to  the 
incident  being  investigated.  Causal  factors  may  be  identified  as  being  contributing, 
direct,  or  root  causes.  Appendix  F  provides  recommended  worksheets  to  identify  the 
hierarchical  position  of  causal  factors. 
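
The two-way categorization described in Step 3 can be sketched as follows. This is an illustration under stated assumptions: the source labels ("PROCEDURE", "STAFFING", "MANAGEMENT") are placeholders and are not the Cause Codes of Appendix E, and the factors and their hierarchy assignments are invented for the example.

```python
from collections import defaultdict
from dataclasses import dataclass
from enum import Enum


class Hierarchy(Enum):
    DIRECT = "direct"
    CONTRIBUTING = "contributing"
    ROOT = "root"


@dataclass(frozen=True)
class CausalFactor:
    description: str
    cause_code: str       # placeholder label, NOT an actual Appendix E code
    hierarchy: Hierarchy


def summarize(factors):
    """Group factor descriptions by source and by hierarchy."""
    by_source = defaultdict(list)
    by_hierarchy = defaultdict(list)
    for factor in factors:
        by_source[factor.cause_code].append(factor.description)
        by_hierarchy[factor.hierarchy].append(factor.description)
    return dict(by_source), dict(by_hierarchy)


# Hypothetical factors, invented for this sketch.
factors = [
    CausalFactor("Required step skipped", "PROCEDURE", Hierarchy.DIRECT),
    CausalFactor("No backup inspector on shift", "STAFFING", Hierarchy.CONTRIBUTING),
    CausalFactor("Schedule pressure not managed", "MANAGEMENT", Hierarchy.ROOT),
]
by_source, by_hierarchy = summarize(factors)
```

Grouping by source points to the areas needing attention during corrective action development; grouping by hierarchy separates the root cause(s) from direct and contributing causes.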

8.  PHASE  III  -  CORRECTING


In  Phase  II,  the  root  cause  analysis  identified  the  causes  to  be  corrected  to  avoid 
recurrence  of  an  incident  and,  ultimately,  to  improve  reliability  and  safety.  In  the  Correcting 
phase  (Phase  III),  effective  corrective  actions  are  selected  and  implemented  based  upon  the 
results  of  the  root  cause  analysis. 

To  begin,  identify  a  potential  corrective  action  for  each  cause;  then  ensure  the  viability  of 
the  potential  corrective  actions  using  the  following  questions.  If  the  potential  corrective  actions 
are  not  viable,  re-evaluate  and  identify  alternate  corrective  actions. 

1.  Will  the  corrective  action  prevent  recurrence? 

2.  Is  the  corrective  action  feasible? 

3. Does the corrective action permit the primary objectives to be met and the mission to
be accomplished?

4.  If  the  corrective  action  introduces  new  risks,  are  those  risks  known  and  acceptable? 

5.  Do  the  corrective  actions  preserve  the  safety  of  other  systems? 

6. Were the immediate actions taken appropriate and effective in reducing risk of
recurrence?

7.  Are  sufficient  Policy  Directives  or  Instructions  in  place? 


A systems approach should be used in determining appropriate corrective actions. This
approach should consider not only the impact the corrective actions will have on preventing
recurrence, but also the potential that they may actually degrade safety. The impact the corrective
actions will have on other operations should also be considered. The proposed corrective actions must
be  compatible  with  facility  commitments  and  other  obligations.  In  addition,  those  affected  by  or 
responsible  for  any  part  of  the  corrective  actions,  including  management,  should  be  involved  in 
the  process.  Proposed  corrective  actions  should  be  reviewed  to  ensure  the  above  criteria  have 
been  met,  and  should  be  prioritized  based  on  importance,  scheduled  (a  change  in  priority  or 
schedule  should  be  approved  by  management),  entered  into  a  commitment  tracking  system,  and 
implemented  in  a  timely  manner.  A  complete  corrective  action  program  should  be  based,  not 
only  on  specific  causes  of  incidents,  but  also  on  items  such  as  lessons  learned  from  other 
facilities,  assessments,  and  personnel  suggestions. 

A  successful  corrective  action  program  requires  involving  management  at  the  appropriate 
level.  Management  must  be  willing  to  take  responsibility  and  allocate  adequate  resources  for 
corrective  actions.  Effective  corrective  action  programs  include  the  following: 

•  Management  emphasis  on  the  identification  and  correction  of  problems  that  can  affect  human 
and  equipment  performance,  including  assigning  qualified  personnel  to  effectively  evaluate 
equipment/human  performance  problems,  implementing  corrective  actions,  and  following  up 
to  verify  corrective  actions  are  effective. 

•  Developing  administrative  procedures  that  describe  the  process,  identify  resources,  and 
assign  responsibility. 

•  Developing  a  working  environment  that  requires  accountability  for  correction  of  impediments 
to  error-free  task  performance  and  reliable  equipment  performance. 

•  Developing  a  working  environment  that  encourages  voluntary  reporting  of  deficiencies,
errors,  or  omissions. 

•  Sponsoring  training  programs  for  those  individuals  who  are  expected  to  conduct  root-cause 
analyses. 

•  Training  of  personnel  and  managers  to  recognize  and  report  incidents,  including  early 
identification  of  significant  and  generic  problems. 

•  Developing  programs  to  ensure  prompt  investigation  following  an  incident  or  identification 
of  declining  trends  in  performance  to  determine  root  causes  and  corrective  actions. 

•  Adopting  a  classification  and  trending  system  that  identifies  those  factors  that  continue  to 
cause  problems  with  generic  implications. 

Additional specific questions and considerations in developing and implementing corrective
actions include:

•  Do  the  corrective  actions  address  all  the  causes  (especially  the  root  cause(s))? 

•  What  are  the  consequences  of  not  implementing  the  corrective  actions? 

•  What  is  the  cost  of  implementing  the  corrective  actions  (capital  costs,  operations,  and 
maintenance  costs)? 

•  Will  training  be  required  as  part  of  the  implementation? 

•  In  what  time  frame  can  the  corrective  actions  reasonably  be  implemented? 

•  What  resources  are  required  for  successful  development  of  the  corrective  actions? 

•  What  resources  are  required  for  successful  implementation  and  continued  effectiveness  of  the 
corrective  actions? 

•  What  impact  will  the  development  and  implementation  of  the  corrective  actions  have  on  other 
work  groups? 

•  Is  the  implementation  of  the  corrective  actions  measurable?  (For  example,  “Revise  step  6.2 
of  the  technical  order,  to  include  diagrams,  to  clearly  reflect  the  location  requiring  inspection” 
is  measurable;  “Ensure  the  actions  of  procedure  step  6.2  are  performed  correctly  in  the 
future”,  is  not  measurable). 




9.  PHASE  IV  -  INFORMING 


Effectively  preventing  the  recurrence  of  incidents  requires  the  distribution  of  summary 
reports  and  lessons  learned  to  all  personnel  who  might  benefit.  Identification  of  report  recipients 
depends  upon  the  specific  nature  of  the  incident  being  investigated.  However,  at  a  minimum, 
personnel  directly  associated  with  the  incident  and  those  in  their  direct  reporting  chain  should 
receive  post-investigation  reports.  In  addition,  consideration  should  be  given  to  directly  sharing 
the  details  of  root  cause  information  with  facilities  and  organizations  that  are  engaged  in  similar 
work  or  where  significant  or  long-standing  problems  may  exist. 

For  Air  Force  investigations  of  structural  inspection  related  incidents,  it  is  recommended 
that,  at  a  minimum,  the  following  personnel  and  organizations  be  informed  of  the  investigation 
findings  and  recommended  corrective  actions: 


-  Affected Inspection Facility Personnel
-  Affected Inspection Facility Supervisor and Commander
-  All Facilities Performing Similar Inspections
-  Weapon System Manager
-  Weapon System ASIP Manager
-  Air Force NDI Office - AFRF/MFS-OF
-  Major Command NDI Functional (MAJCOM/A4)
-  Air Force Safety Center
-  Air Force ASIP Manager - ASC/EN
-  HQ AFMC/A4
-  HQ AFMC/CV (Type I Incidents)

10.  PHASE  V  -  VERIFYING


In  the  Verifying  phase,  the  effectiveness  of  corrective  actions  in  resolving  the  root  and 
contributing  causes  of  the  incident  is  evaluated.  First,  the  corrective  actions  should  be  tracked  to 
ensure  that  they  have  been  properly  implemented  and  are  functioning  as  intended.  Second,  a 
structured  review  of  corrective  action  tracking,  normal  process  and  change  controls,  and  incident 
tracking  should  be  conducted  to  ensure  that  past  corrective  actions  have  been  effectively  handled. 
The  recurrence  of  the  same  or  similar  incidents  must  be  identified  and  analyzed.  If  an  incident 
recurs,  the  original  incident  should  be  re-evaluated  to  determine  why  corrective  actions  were 
ineffective.  Also,  the  new  incident  should  be  investigated  using  the  Change  Analysis  method. 
Process  change  controls  should  be  evaluated  to  determine  what  improvements  are  needed  to  keep 
up  with  changing  conditions.  Early  indications  of  deteriorating  conditions  can  be  obtained  from 
tracking  and  trend  analyses  of  incident  information.  Prompt  corrective  actions  should  be  taken  to 
reverse  deteriorating  conditions  or  to  apply  lessons  learned. 

APPENDIX  A


SEQUENTIAL  EVENTS  AND  CAUSAL  FACTOR  ANALYSIS 

The Sequential Events and Causal Factor Analysis consists of two primary actions:

1)  The  Sequential  Events  Analysis 

2)  Events  and  Causal  Factors  Charting 

A1.0  SEQUENTIAL  EVENTS  ANALYSIS 

Sequential  Events  (Walk-through)  Analysis  is  a  method  in  which  personnel  conduct  a  step-by- 
step  reenactment  of  their  actions  for  the  observer  without  carrying  out  the  actual  function. 

Objectives  include: 

•  Determining  how  a  task  was  really  performed. 

•  Identifying  problems  in  human-factors  design,  discrepancies  in  procedural  steps, 
training,  etc. 

Preconditions  are  that  participants  must  be  the  people  who  actually  do  the  task. 

Steps in Sequential Events and Causal Factor Analysis are as follows:

Step 1. Obtain preliminary information so you know what the person was doing when the
problem  or  inappropriate  action  occurred. 

Step  2.  Decide  on  a  task  of  interest. 

Step  3.  Obtain  necessary  background  information: 

•  Obtain  relevant  procedures. 

•  Obtain  system  drawings  if  required,  etc. 

•  Interview  personnel  who  have  performed  the  task  (but  not  those  who  will  be 
observed)  to  obtain  understanding  of  how  the  task  should  be  performed. 

Step  4.  Produce  a  guide  outlining  how  the  task  will  be  performed.  In  the  case  of  a 
maintenance  action,  the  maintenance  procedure  or  technical  order,  with  key  items  underlined, 
is  the  easiest  way  of  doing  this.  The  guide  should  indicate  steps  in  performing  the  task  and 
key  controls  and  displays  so  that: 

•  You  will  know  what  to  identify. 

•  You  will  be  able  to  record  actions  more  easily. 

Step  5.  Thoroughly  familiarize  yourself  with  the  guide  and  decide  exactly  what  information 
you  are  going  to  record  and  how  you  will  record  it. 

You  may  want  to  check  off  each  step  and  controls  or  displays  used  as  they  occur. 
Discrepancies  and  problems  may  be  noted  in  the  margin  or  in  a  space  provided  for 
comments,  adjacent  to  the  step. 

Step 6. Select personnel who normally perform the task. If the task is performed by multiple
individuals, the individuals should play the same role they fulfill when performing the task.
Observe  personnel  walking  through  the  task  and  record  their  actions  and  use  of  displays  and 
controls.  Note  discrepancies  and  problem  areas. 

Step  7.  You  should  observe  the  task  as  it  is  normally  performed;  however,  if  necessary,  you 
may  stop  the  task  to  gain  full  understanding  of  all  steps.  Conducting  the  task  as  closely  to  the 
conditions  that  existed  when  the  event  occurred  will  provide  the  best  understanding  of  the 
event  causal  factors. 

Step  8.  Summarize  and  consolidate  any  problem  areas  noted.  Identify  probable  contributors 
to  the  event. 

Step 9. Generate an Events and Causal Factors Chart to assist in a logical assessment of the
findings. From the Events and Causal Factors Chart, identify the most probable causes.
Guidance for Events and Causal Factor Charting is provided in section A2.0.

A2.0  EVENTS  AND  CAUSAL  FACTOR  CHARTING 

(Adapted from the DOE publication SCIE-DOE-01-TRAC-14-95, Events and Causal Factor
Analysis) 

A2.1  What  is  an  Events  and  Causal  Factor  Chart?

The  Events  and  Causal  Factors  (ECF)  chart  depicts  the  necessary  and  sufficient  events  and  causal 
factors  for  an  incident  in  a  logical  sequence.  It  can  be  used  not  only  to  analyze  the  incident  and 
evaluate  the  evidence  during  investigation,  but  also  can  help  validate  the  accuracy  of  pre-incident 
systems  analyses. 

The  primary  purpose  of  the  investigation  is  to  determine  what  happened  and  why  it  happened  in 
order  to  prevent  similar  incidents  and  to  improve  the  safety  and  efficiency  of  future  operations. 
When  serious  incidents  occur,  they  are  often  symptomatic  of  systemic  deficiencies  which  also 
impair  performance  and  production.  When  the  incident  is  used  as  a  window  through  which  to 
view  the  existing  management  system,  these  deficiencies  are  revealed  and  benefits  are  derived 
which  go  far  beyond  correction  of  the  immediate  causes  of  the  incident.  The  emphasis,  then, 
should  be  placed  on  discovering  all  cause-effect  relationships  from  which  practical  corrective 
actions  can  be  derived  to  improve  total  performance.  The  intent  of  the  investigation,  then,  is 
not  to  place  blame,  but  rather  to  determine  how  responsibilities  can  be  clarified  and  how 
loss-producing  errors  can  be  reduced  and  controlled.  Accurate  ECF  charts  can  help  satisfy 
these  general  purposes  by: 

>  providing  a  cause-oriented  explanation  of  the  accident; 

>  providing  a  basis  for  beneficial  changes  to  prevent  future  occurrences  and  operational 
errors; 

>  helping  delineate  areas  of  responsibility; 

>  helping  assure  objectivity  in  the  conduct  of  the  investigation; 

>  organizing quantitative data (e.g., time, velocity, temperature) related to events
and  conditions; 

>  acting  as  an  operational  training  tool; 

>  providing  an  effective  aid  to  future  systems  design. 

More  specifically,  an  ECF  chart:

>  aids  in  developing  evidence,  in  detecting  all  causal  factors  through  sequence 
development,  and  in  determining  the  need  for  in-depth  analysis; 

>  clarifies  reasoning; 

>  illustrates  multiple  causes.  As  previously  stated,  safety  incidents  rarely  have  a  single 
“cause”.  Charting  helps  illustrate  the  multiple  causal  factors  involved  in  the  accident 
sequence,  as  well  as  the  relationship  of  proximate,  remote,  direct,  and  contributory 
causes; 

>  visually portrays the interactions and relationships of all involved organizations and
individuals;

>  illustrates the chronology of events showing relative sequence in time;

>  provides  flexibility  in  interpretation  and  summarization  of  collected  data; 

>  conveniently  communicates  empirical  and  derived  facts  in  a  logical  and  orderly 
manner; 

>  links  specific  incident  factors  to  organizational  and  management  control  factors. 

Sections  A2.2  and  A2.3  provide  a  set  of  conventions  and  criteria  to  be  used  in  ECF  charting. 
These  conventions  are  intended  to  improve  comparability  and  consistency  in  incident  reporting 
and  to  assist  the  communication  of  investigation  findings.  These  conventions  are  intended  to  be 
as  simple  as  possible  while  preserving  the  effectiveness  of  ECF  charts.  It  is  further  intended  that 
investigators  be  provided  with  helpful  guidelines  without  inhibiting  their  use  of  this  tool  by 
imposing an overly complex set of rules. In section A3.0, more general guidelines are given for
the performance of the Sequential Events and Causal Factor Analysis method. An example
Events and Causal Factors Chart is provided in section A3.1.

A2.2  Conventions  for  Events  and  Causal  Factors  Charts 

(a)  Events  should  be  enclosed  in  rectangles,  and  conditions  in  ovals. 


[Figure: event symbol, a rectangle labeled “EVENTS”]


(b)  Events  should  be  connected  by  solid  arrows. 


(c)  Conditions  should  be  connected  to  each  other  and  to  events  by  dashed  arrows. 




(d)  Each  event  and  condition  should  either  be  based  upon  valid  factual  evidence  or 
be  clearly  indicated  as  presumptive  by  dashed  line  rectangles  and  ovals. 



(e)  The  primary  sequence  of  events  should  be  depicted  in  a  straight  horizontal  line 
(or  lines  in  confluent  or  branching  primary  chains)  with  events  joined  by  bold  printed 
connecting  arrows. 


(f)  Secondary  event  sequences,  contributing  factors,  and  systemic  factors  should  be 
depicted  on  horizontal  lines  at  different  levels  above  or  below  the  primary  sequence. 


(g)  Events  should  be  arranged  chronologically  from  left  to  right. 


(h)  Events  should  track  in  logical  progression  from  the  beginning  to  the  end  of  the  event 
sequence  and  should  include  all  pertinent  incidents.  This  necessitates  that  the  beginning  and  the 
end  be  defined  for  each  incident  sequence.  Analysts  frequently  use  the  incident  as  the  key  event 
and  proceed  from  it  in  both  directions  to  reconstruct  the  pre-incident  and  post-incident  ECF 
sequences. 
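
Conventions (a) through (d) and (g) above can be captured in a small data structure. The following Python sketch is one possible representation, not a prescribed tool: the class names `Node` and `ECFChart`, the `when` ordering field, and the sample entries are assumptions made for illustration.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Node:
    text: str
    kind: str                      # "event" (rectangle) or "condition" (oval)
    presumptive: bool = False      # convention (d): True means dashed outline
    when: Optional[float] = None   # hypothetical ordering key, e.g. clock hours

    def outline(self):
        shape = "rectangle" if self.kind == "event" else "oval"
        line = "dashed" if self.presumptive else "solid"
        return f"{line} {shape}"


@dataclass
class ECFChart:
    nodes: list = field(default_factory=list)
    links: list = field(default_factory=list)   # (source, target, arrow style)

    def add(self, node):
        self.nodes.append(node)
        return node

    def connect(self, src, dst):
        # Conventions (b) and (c): solid arrows join events; any link that
        # involves a condition is drawn dashed.
        style = "solid" if src.kind == "event" and dst.kind == "event" else "dashed"
        self.links.append((src, dst, style))
        return style

    def primary_sequence(self):
        # Convention (g): events run chronologically from left to right.
        # Every event must carry a `when` value for the sort to work.
        events = [n for n in self.nodes if n.kind == "event"]
        return sorted(events, key=lambda n: n.when)


chart = ECFChart()
inspection = chart.add(Node("Inspector performed inspection", "event", when=10.0))
rollout = chart.add(Node("Crew chief rolled aircraft out of hangar", "event", when=8.0))
cold = chart.add(Node("Outdoor temperature was 25°F", "condition"))
chart.connect(rollout, inspection)   # event to event
chart.connect(cold, inspection)      # condition to event
```

Keeping the presumptive flag on each node makes it easy to list which events and conditions still lack factual evidence as the investigation proceeds.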

A2.3  Suggested  Criteria  for  Event  Descriptions  and  Conditions

(a) Each event should describe an incident or happening and not a condition, state,
circumstance, issue, conclusion, or result; i.e., “Aircraft was grounded due to a large crack in the wing
that caused a fuel leak”.

(b)  Each  event  should  be  described  by  a  short  sentence  with  one  subject  and  one  active 
verb;  i.e.,  “technician  performed  visual  inspection  of  landing  gear  trunnion”,  not  “technician 
wiped  down  the  landing  gear  trunnion  with  solvent  and  then  performed  visual  inspection”. 

(c)  Each  event  should  be  precisely  described;  i.e.,  “the  technician  adjusted  the  inspection 
frequency  to  20kHz  ‘on’  position”,  not  “the  technician  selected  the  inspection  frequency”. 

(d)  Each  event  should  describe  a  single,  discrete  incident;  i.e.,  “flap-track  liberated  from 
airframe”,  not  “flap-track  liberated  from  aircraft  and  FODed  engine”. 

(e)  Each  event  should  be  quantified  when  possible;  i.e.,  “plane  descended  350  feet”,  not 
“plane  lost  altitude.” 

(f)  Each  event  should  be  derived  directly  from  the  event  (or  events  in  the  case  of  a 
branched  chain)  and  conditions  preceding  it;  i.e.,  “inspector  identified  inspection  location  by 
“feel”  using  right  hand”  is  preceded  by  “inspector  crawled  into  fuel  cell  to  gain  access”  which  is 
preceded  by  “the  crew  chief  removed  fuel  cell  access  panel  IAW  TO”  -  each  event  deriving 
logically  from  the  one  preceding  it.  When  this  is  not  the  case,  it  usually  indicates  that  one  or 
more  steps  in  the  sequence  have  been  left  out. 

(g)  Conditions  differ  from  events  insofar  as  they  (a)  describe  states  or  circumstances 
rather  than  happenings  or  incidents  and  (b)  are  passive  rather  than  active.  As  far  as  practical, 
conditions  should  be  precisely  described,  quantified  when  possible,  posted  with  time  and  date 
when  possible,  and  be  derived  directly  from  the  conditions  immediately  preceding  them. 

A3.0  Guidelines  for  Practical  Application 

The experience of many people participating in numerous accident investigations has led to the
identification of six key elements in the practical application of Sequential Events and Causal
Factor Analysis to achieve high quality accident investigations.

(a) Begin early. As soon as you start to accumulate factual information on events and
conditions related to the incident, begin construction of a “working chart” of events and causal
factors. It is often helpful also to rough out a fault tree of the incident to establish how the
accident could have happened. This can prevent false starts and “wild goose chases” but must be
done with caution so that you don’t lock yourself into a preconceived model of the incident.

(b)  Remember  to  keep  the  proper  perspective  in  applying  these  guidelines;  they  are 
intended  to  guide  you  in  the  simple  application  of  a  valuable  investigative  tool.  They  are  not 
exact  rules  that  must  be  applied  without  question  or  reason.  If  you  have  a  truly  unique  situation 
and  feel  that  you  must  deviate  from  the  guidelines  for  clarity  and  simplicity,  do  it.  Analytical 
techniques  should  be  servants  not  masters. 

(c)  Proceed  logically  with  available  data.  Events  and  causal  factors  usually  do  not 
emerge  during  the  investigation  in  the  sequential  order  in  which  they  occurred.  Initially,  there 
may  be  many  holes  and  deficiencies  in  the  chart.  Efforts  to  fill  these  holes  and  get  accurate
tracking  of  the  event  sequences  and  their  derivation  from  contributing  conditions  will  lead  to 
deeper  probing  by  investigators  that  will  uncover  the  true  facts  involved.  In  proceeding  logically, 
using  available  information  to  direct  the  search  for  more,  it  is  usually  easiest  to  use  the  incident  as 
the  starting  point  and  reconstruct  the  pre-accident  and  post-accident  sequences  from  that  vantage 
point. 


(d)  Use  an  easily  updated  format.  As  additional  facts  are  discovered  and  analyses  of 
those  facts  further  identify  causal  factors,  the  working  chart  will  need  to  be  updated.  Unless  a 
format  is  selected  which  displays  the  emerging  information  in  an  easily  modified  form, 
construction  of  the  chart  can  be  very  repetitious  and  time-consuming.  Successive  redrafts  of  the 
ECF  chart  on  large  sheets  of  paper  have  been  done;  magnetic  display  boards  or  chalkboards  have 
been  used;  but  the  technique  that  has  consistently  proven  most  effective  and  most  easily  updated 
is  use  of  “post-it”  notes  on  which  brief  event  or  condition  statements  are  written.  A  single  event 
or  condition  is  written  on  each  note.  The  notes  are  then  stuck  to  a  wall  or  a  large  roll  of  heavy 
paper  in  the  sequence  of  events  as  then  understood.  As  more  information  is  revealed,  notes  can 
be  rearranged,  added,  or  deleted  to  produce  a  more  complete  and  accurate  version  of  the  working 
chart.  Once  the  note-based  working  chart  has  been  finalized,  the  ECF  chart  can  be  drawn  for 
inclusion  in  the  investigation  report. 

(e)  Select  the  appropriate  level  of  detail  and  sequence  length  for  the  ECF  chart.  The 
incident,  itself,  and  the  depth  of  investigation  specified  by  the  investigation  commissioning 
authority  will  often  suggest  the  amount  of  detail  desired. 

(f)  Make  a  short  executive  summary  chart  when  necessary.  The  ECF  working  chart  will 
contain  much  detail  so  it  can  be  of  greatest  value  in  shaping  and  directing  the  investigation.  In 
general,  significantly  less  detail  is  required  in  the  ECF  chart  presented  in  the  investigation  report, 
because  the  primary  purpose  is  to  provide  a  concise  and  easy-to-follow  orientation  to  the  accident 
sequence  for  the  report  reader.  When  a  detailed  ECF  chart  is  felt  to  be  necessary  to  show 
appropriate  relationships  in  the  analysis  section  of  an  appendix  of  the  report,  an  executive 
summary  chart  of  only  one  or  two  pages  should  be  prepared  and  included  in  the  report  to  meet  the 
above  stated  purpose. 

A3.1  Events  &  Causal  Factors  Chart  Example 

Application of the suggested format and event description criteria for constructing a typical ECF
chart of a simple inspection incident is illustrated in the following example.

Incident  Description 

A 30-day time compliance technical order (TCTO) was released requiring the eddy current
inspection  of  the  wing  flap  track  on  a  dual  engine  US  Air  Force  aircraft.  The  TCTO  tech  data 
was  successfully  validated  by  depot  NDI  and  structural  engineering  prior  to  TCTO  release.  The 
TCTO  was  scheduled  to  be  accomplished  on  21  December  2004,  on  A/C  90-678  prior  to  a 
scheduled  7:00  AM  training  mission.  The  aircraft  was  grounded  pending  successful  performance 
of  the  TCTO.  The  inspector  assigned  to  the  inspection  task  was  late  arriving  to  work.  No  other 
inspection  personnel  were  available  at  the  time  due  to  holiday  vacations.  To  expedite  flight 
preparations,  the  crew  chief  rolled  the  aircraft  out  of  the  hangar  at  8:00  AM  to  perform  an  engine 
check. At 9:35 AM the inspector arrived at work and was assigned the task by his supervisor. At
10:00 AM the inspector arrived at the aircraft and performed the inspection. The outdoor ambient
temperature was 25°F with light winds at the time of the inspection. The indoor ambient
temperature where the inspection equipment was stored was 65°F. On 27 February 2005, A/C 90-678
experienced an in-flight failure of the left engine, declared an in-flight emergency, and
landed safely at a local municipal airport.

A  subsequent  failure  analysis  revealed  that  the  left-wing  flap-track  failed,  resulting  in  the  in-flight 
liberation  of  a  mounting  bracket.  The  bracket  was  ingested  by  the  engine,  resulting  in  FOD 
damage and engine failure. Laboratory analysis indicated the flap track failed due to fatigue.
Analysis also indicates that a crack of a size between a_NDI and a_critical (> a_NDI but
< a_critical) was present during the previous inspection.

The  resulting  investigation  revealed  that  the  inspector  apparently  elected  to  ignore  TCTO 
requirements  for  aircraft  hangaring  during  inspection.  Furthermore  the  inspector  did  not  allow 
inspection  equipment,  probes  and  reference  standard  to  reach  ambient  conditions  prior  to 
beginning the inspection. Failure to do so resulted in excessive instrument drift and
misinterpretation  of  inspection  results.  The  causal  factors  for  the  undetected  crack  in  the  flap 
were  a)  failure  of  the  inspector  to  follow  technical  data  and  b)  failure  of  management  to  ensure 
sufficient  personnel  resources  to  address  mission  needs. 

Discussion 

Figure  A1  is  the  ECF  chart  of  this  incident.  Note  that  the  events  are  in  chronological  order,  that 
each  follows  logically  from  the  one  preceding  and  that  the  dates  are  indicated  where  known. 
Events  are  enclosed  in  rectangles  and  the  conditions  in  ovals.  Event  statements  are  characterized 
by  single  subjects  and  active  verbs.  Primary  events  are  connected  by  bold  solid  lines,  other 
events  by  solid  lines,  and  conditions  by  dashed  lines.  Presumptive  information  (i.e.,  the  inference 
is  clear  but  the  evidence  is  lacking)  is  shown  in  ovals  and  rectangles  drawn  in  dashed  lines. 

Note:  The  ECF  technique  can  provide  a  clear  illustration  of  the  timeline  of  conditions  and  events, 
as  well  as  help  to  identify  significant  causal  factors.  However,  use  of  this  technique  alone  may 
result  in  the  failure  to  identify  other  significant  causal  factors  which  may  be  uncovered  with 
application  of  additional  RCA  methods. 
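
The chart conventions described above can be captured in a small data model. The following Python sketch is illustrative only: the class name `Node`, its field names, the abbreviated event wording, and the choice of which events to mark as primary are assumptions made for this example and are not part of the ECF technique itself. It records a few events and conditions from the incident description and checks that the timed events are in chronological order.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Optional


@dataclass
class Node:
    text: str
    kind: str                       # "event" (rectangle) or "condition" (oval)
    when: Optional[datetime] = None
    primary: bool = False           # primary events are joined by bold solid lines
    presumptive: bool = False       # inference is clear but evidence is lacking


def outline(node: Node) -> str:
    """Return the drawing style the chart conventions call for."""
    shape = "rectangle" if node.kind == "event" else "oval"
    style = "dashed" if node.presumptive else "solid"
    return f"{style} {shape}"


def in_chronological_order(nodes) -> bool:
    """Check that every timed event is no earlier than the one before it."""
    times = [n.when for n in nodes if n.kind == "event" and n.when is not None]
    return all(a <= b for a, b in zip(times, times[1:]))


# Abbreviated wording; primary/presumptive flags are assumed for illustration.
chart = [
    Node("Crew chief rolls aircraft out of hangar", "event",
         datetime(2004, 12, 21, 8, 0), primary=True),
    Node("Outdoor ambient temperature is 25 F", "condition"),
    Node("Inspector arrives at work", "event",
         datetime(2004, 12, 21, 9, 35), primary=True),
    Node("Inspector performs inspection at aircraft", "event",
         datetime(2004, 12, 21, 10, 0), primary=True),
    Node("Inspector elects to ignore hangaring requirement", "event",
         presumptive=True),
    Node("Left engine fails in flight", "event",
         datetime(2005, 2, 27), primary=True),
]

for node in chart:
    print(f"{outline(node):17} {node.text}")
print("chronological:", in_chronological_order(chart))
```

A working chart kept in this form can be reordered or extended as information is revealed, in the same way the adhesive notes are rearranged on the wall.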

Figure A1. Events & Causal Factors Chart Example

APPENDIX B


CAUSE  AND  EFFECT  ANALYSIS 

(Adapted  from  Basic  Tools  for  Process  Improvement) 

Cause and Effect Analysis is a root cause analysis method that generates and sorts hypotheses about
possible causes by listing all possible causes and effects.

Once the data gathering and initial analysis are complete, the possible causes and effects can
be  graphically  organized.  A  Cause  and  Effect  Diagram  is  a  tool  that  helps  quickly  identify,  sort 
and  display  possible  causes  of  a  specific  problem.  It  graphically  illustrates  the  relationships 
between  a  given  outcome  and  all  the  factors  that  influence  the  outcome.  This  type  of  diagram  is 
sometimes  called  an  “Ishikawa  diagram”  because  it  was  invented  by  Kaoru  Ishikawa,  or  a 
“Fishbone  Diagram”  because  of  the  way  it  looks.  This  diagram  should  incorporate  the  resulting 
findings  from  the  Sequential  Event  and  Causal  Factor  Analysis  as  well  as  the  Change  Analysis 
and  Human  Performance  Evaluations  if  performed. 

When  should  you  use  a  Cause  and  Effect  Diagram? 

A  Cause  and  Effect  Diagram  can  help  when  you  need  to: 

o  Identify  the  possible  root  causes,  the  basic  reasons,  for  a  specific  effect,  problem  or 
condition. 

o  Sort  out  and  relate  some  of  the  interactions  among  the  factors  affecting  a  particular 
process  or  effect. 

o  Analyze  existing  problems  so  that  corrective  action  can  be  taken. 

Why  Use  a  Cause  and  Effect  Diagram? 

A  Cause  and  Effect  Diagram  is  a  tool  useful  for  identifying  and  organizing  the  known  or  possible 
causes.  The  structure  provided  by  the  diagram  helps  the  team  members  think  systematically. 
Constructing  a  Cause-and-Effect  Diagram  will  help  in: 

o  Determining the root cause of a problem or quality characteristic using a structured
approach.

o  Encouraging group participation and utilizing group knowledge of the process.

o  Using an orderly, easy-to-read format to diagram cause-and-effect relationships.

o  Indicating possible causes of variations in a process.

o  Increasing knowledge of the process by helping everyone learn more about the
factors at work and how they relate.

o  Quickly eliminating non-causal issues.

o  Identifying areas where data should be collected for further study.

How  to  Develop  a  Cause-and-Effect  Diagram 

When  you  develop  a  Cause-and-Effect  Diagram,  you  are  constructing  a  structured,  pictorial 
display of a list of causes organized to show their relationship to a specific effect. The data will
be analyzed to identify the causal factors, and the findings will be summarized and categorized
by the cause categories. The major cause categories are:

•  Equipment/Material Deficiency

•  Procedure  Deficiency 

•  Personnel  Deficiency 

•  Design/Engineering  Deficiency  (includes  incorrect  assumptions) 

•  Training  Deficiency 

•  Management/Organizational  Deficiency 

•  External  Phenomenon 

These  categories  have  been  selected  with  the  intent  to  address  all  problems  that  could  arise  while 
conducting  inspection  and  maintenance  operations.  The  first  three  categories  are  necessary  to 
perform  any  task  (equipment,  procedures  and  personnel).  Design  and  training  determine  the 
quality  and  effectiveness  of  equipment  and  personnel.  These  elements  must  be  managed; 
therefore,  management  is  also  a  necessary  element  as  it  provides  the  priorities  and  resources  to 
support  requirements.  Whenever  there  is  an  incident,  one  of  these  six  program  elements  was 
inadequate  to  prevent  the  incident.  (An  external  factor  beyond  operational  control  serves  as  a 
seventh  cause  category).  These  causal  factors  can  be  associated  in  a  logical  causal  factor  chain  as 
shown in Figure B1. (Note that a direct, contributing, or root cause can occur any place in the
causal factor chain; that is, a root cause can be an operator error while a management problem can
be a direct cause, depending on the nature of the incident.) These seven cause categories are
subdivided into a total of 32 subcategory cause codes. The direct cause, contributing causes, and
root  cause  are  all  selected  from  these  subcategories  (see  Appendix  E). 

The  steps  for  constructing  and  analyzing  a  Cause-and-Effect  Diagram  are  as  follows: 

Step  1  -  Identify  and  clearly  define  the  outcome  or  EFFECT  to  be  analyzed. 

o  Decide on the effect to be examined. Effects are stated as fault conditions or the
result  or  outcome  of  a  fault  condition. 

o  Use  Operational  Definitions.  Develop  an  Operational  Definition  of  the  effect  to 
ensure  that  it  is  clearly  understood. 

Step  2  -  Draw  the  SPLINE  and  create  the  EFFECT  box. 

o  Draw a horizontal arrow pointing to the right. This is the spline.

o  To the right of the arrow, write a brief description of the effect.

The following example will diagram the causes relating to a fatigue crack missed during
an inspection of an aircraft flap track. Therefore, the EFFECT is Missed Fatigue Crack
in Flap Track (see Figure B1).

o  Draw  a  box  around  the  description  of  the  effect. 


[Figure: spline arrow pointing to the effect box, MISSED FATIGUE CRACK IN FLAP TRACK]

Figure B1. Spline Drawn to the Effect




Step  3  -  Identify  the  main  CAUSES  contributing  to  the  effect  being  studied.  These  are  the  labels 
for  the  major  branches  of  your  diagram  and  become  categories  under  which  to  list  the  main 
causes  related  to  those  categories. 

o  Establish  the  main  causes,  or  categories,  under  which  other  possible  causes  will  be 
listed. You should use category labels that make sense for the diagram you are
creating.  It  is  recommended  that  the  cause  categories  defined  in  Appendix  E  be  used. 
These  categories  are  as  follows: 

Equipment/Materials 

Procedures 

Personnel 

Design/Engineering 

Training 

Management/Organization 
External  Phenomenon 

o  Write  the  main  categories  your  team  has  selected  to  the  left  of  the  effect  box,  some 
above  the  spline  and  some  below  it. 

o  Draw  a  box  around  each  category  label  and  use  a  diagonal  line  to  form  a  branch 
connecting  the  box  to  the  spline  (See  Figure  B2). 


Figure  B2.  Main  Cause  Categories  Connected  by  a  Spline  to  the  Effect 

Step  4  -  For  each  major  branch,  identify  other  specific  factors  which  may  be  the  CAUSES  of  the 
EFFECT. 

o  Identify  as  many  causes  or  factors  as  possible  and  attach  them  as  sub-branches  of  the 
major  branches. 

EXAMPLE: The possible CAUSES for the Missed Fatigue Crack in Flap Track are
listed under the appropriate categories in Figure B3.

o  Fill in detail for each cause. If a minor cause applies to more than one major cause,
list  it  under  both. 


Figure  B3.  Cause  and  Effects  Diagram  -  First  Level  Causes 


Step  5  -  Identify  increasingly  more  detailed  levels  of  cause  and  continue  organizing  them  under 
related  causes  or  categories.  You  can  do  this  by  asking  a  series  of  questions. 

EXAMPLE:  Use  a  series  of  why  questions  to  fill  in  the  detailed  levels  for  one  of  the  causes  listed 
under  each  of  the  main  categories. 

PROCEDURES 

Q:  Why  were  the  INSTRUCTIONS  INCOMPLETE? 

A:  Poor  definition  of  inspection  zone. 

Q:  Why  was  the  inspection  zone  poorly  defined? 

A:  Procedures  not  validated. 

EQUIPMENT 

Q:  Why  was  the  WRONG  PROBE  used? 

A:  Instructions  Ignored. 

Q:  Why  were  the  instructions  ignored? 

A:  Inspector  inexperience. 

A:  Inspector  unprepared/pressured. 

A:  Correct  probe  failed  and  not  available. 

Q:  Why  was  a  replacement  probe  not  available? 

A:  No  $  for  replacement. 

PERSONNEL 

Q:  Why  was  the  inspector  RUSHED? 

A:  Always  late  to  work. 

A:  Command  pressure  -  Wing  Commander  angry  due  to  mission  delay 

Figure  B4  shows  how  the  diagram  looks  when  all  the  contributing  causes  that  were  identified  by 
the  series  of  why  questions  have  been  filled  in.  As  you  can  see,  there  may  be  many  levels  of 
causes  contributing  to  the  effect. 

NOTE: You may need to break the diagram into smaller diagrams if one branch has too many
sub-branches.  Any  main  cause  can  be  reworded  into  an  effect. 
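
The series of why questions in Step 5 can be sketched as a nested structure in which each cause points to the deeper causes behind it. The following Python example is a minimal illustration, not a prescribed tool: the variable name `fishbone` and the function `why_chains` are assumptions made here, and "Correct probe failed and not available" is placed directly under "Wrong probe used", following the arrangement used in the Step 6 numbering example.

```python
# Each key is a cause; its value holds the deeper causes found by asking "why?".
fishbone = {
    "Procedures": {
        "Instructions incomplete": {
            "Poor definition of inspection zone": {
                "Procedures not validated": {},
            },
        },
    },
    "Equipment": {
        "Wrong probe used": {
            "Instructions ignored": {
                "Inspector inexperience": {},
                "Inspector unprepared/pressured": {},
            },
            "Correct probe failed and not available": {
                "No $ for replacement": {},
            },
        },
    },
    "Personnel": {
        "Rushed": {
            "Always late to work": {},
            "Command pressure": {},
        },
    },
}


def why_chains(tree, path=()):
    """Yield every chain from a main category down to a deepest cause."""
    for cause, deeper in tree.items():
        chain = path + (cause,)
        if deeper:
            yield from why_chains(deeper, chain)
        else:
            yield chain


for chain in why_chains(fishbone):
    print(" <- ".join(chain))
```

Each printed chain reads from the main category down to the deepest answer recorded so far, which makes it easy to see where another why question could still be asked.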


[Figure B4: fishbone diagram. Four main branches (1.0 Procedures, 2.0 Equipment, 3.0 Personnel,
4.0 Management) connect to the effect box, MISSED FATIGUE CRACK IN FLAP TRACK. Cause labels
shown on the branches: Instructions incomplete; Not validated; Poor definition of inspection zone;
Poor technique; Inadequate experience/training; Correct probe not identified; Improperly delegated
task; No Level III; Authored by unqualified tech; Equipment temp unstable; Unhangared inspection;
Instrument drift; Correct probe failed; No $ for replacement; Wrong probe used; Inspector
inexperience; Inspector unprepared/pressured; Instructions ignored; Always late to work; Command
pressure; Rushed; Poor personnel management; Poor scheduling; No accountability; No $; Inadequate
resource allocation; Wrong priorities.]

Figure B4. Cause and Effects Diagram - With Multiple Contributing Causes


Step  6  -  Assign  a  numeric  code  for  each  of  the  causes  and  minor  causes.  One  approach  is  shown 
in  the  following  example: 

EXAMPLE: See Figure B5. Note: Cause level expanded further.

3.0  EQUIPMENT
    3.1  Instrument drift
        3.1.1  Equipment temp unstable
            3.1.1.1  Unhangared inspection
                3.1.1.1.1  Inadequate procedures
                3.1.1.1.2  Instructions ignored
    3.2  Wrong probe used
        3.2.1  Correct probe failed
            3.2.1.1  No $ for replacement
        3.2.2  Instructions ignored
            3.2.2.1  Inadequate experience/training
            3.2.2.2  Inspector unprepared/pressured
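
The numbering approach in this example can be generated mechanically from the diagram. The following Python sketch is one possible way to do so; the function name `assign_codes` and the nested-dictionary layout are assumptions made for illustration. Each cause receives its parent's code followed by its position among its siblings.

```python
def assign_codes(tree, prefix):
    """Return (code, cause) pairs, numbering each level 1, 2, 3, ... under its parent."""
    rows = []
    for i, (cause, deeper) in enumerate(tree.items(), start=1):
        code = f"{prefix}.{i}"
        rows.append((code, cause))
        rows.extend(assign_codes(deeper, code))
    return rows


# The Equipment branch as listed in the example above.
equipment = {
    "Instrument drift": {
        "Equipment temp unstable": {
            "Unhangared inspection": {
                "Inadequate procedures": {},
                "Instructions ignored": {},
            },
        },
    },
    "Wrong probe used": {
        "Correct probe failed": {"No $ for replacement": {}},
        "Instructions ignored": {
            "Inadequate experience/training": {},
            "Inspector unprepared/pressured": {},
        },
    },
}

rows = [("3.0", "EQUIPMENT")] + assign_codes(equipment, "3")
for code, cause in rows:
    print(code, cause)
```

Because the code is derived from position, adding or removing a cause renumbers its later siblings; codes should therefore be regenerated before they are copied to the supporting spreadsheet.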

On a spreadsheet, for each cause, list the major cause and each minor cause and two ways you
know it to be true. If only one way is known, or the evidence is not firm, all possible causes
should be evaluated as potential causes, and the bases for rejected and accepted causes should be
stated.

All  possible  minor  causes  must  be  validated  as  either  true  or  untrue.  If  sufficient  evidence 
indicates  the  minor  cause(s)  are  not  true  then  they  should  be  eliminated  as  possible  contributors. 
Color  coding  the  Cause  and  Effect  Diagram  and  supporting  spreadsheet  may  be  helpful. 
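
The validation rule can be sketched as a simple status function applied to each spreadsheet row. The Python example below is a hedged illustration: the class `CauseRow`, the function `status`, and the reading that two independent pieces of evidence are enough to mark a cause as proven true are assumptions drawn from the "two ways you know it to be true" guidance. The code "1.1" for the refuted Procedures cause is hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class CauseRow:
    code: str
    cause: str
    evidence: list = field(default_factory=list)  # ways the cause is known to be true
    refuted: bool = False                         # sufficient evidence that it is untrue


def status(row: CauseRow) -> str:
    """Map a spreadsheet row to one of the three validation states."""
    if row.refuted:
        return "PROVEN UNTRUE"
    if len(row.evidence) >= 2:       # assumed threshold: two ways known to be true
        return "PROVEN TRUE"
    return "FURTHER ANALYSIS REQUIRED"


rows = [
    CauseRow("2.1.1.1", "Unhangared inspection", [
        "Inspector stated the inspection was conducted on the flightline (25 F)",
        "Inspector stated instrument/standard was not allowed to reach ambient",
    ]),
    # Recreating the ambient conditions did not reproduce the reported drift.
    CauseRow("2.1.1", "Equipment temp not stable"),
    # Hypothetical code for a cause on the Procedures branch.
    CauseRow("1.1", "Instructions incomplete", refuted=True),
]

for row in rows:
    print(f"{row.code:8} {row.cause:28} {status(row)}")
```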

[Figure B5: color-coded Cause and Effect Diagram with supporting spreadsheet]

Legend:
RED - PROVEN UNTRUE
GREEN - PROVEN TRUE
(color not legible) - FURTHER ANALYSIS REQUIRED

2.0 Equipment: Supporting or Refuting Evidence

2.1 Instrument drift

•  During interview, inspector reported unexplained instrument drift.

2.1.1 Equipment temp not stable

•  Recreation of inspection ambient conditions did not result in level of drift reported by
inspector - further test required.

2.1.1.1 Unhangared inspection

•  Interview with inspector indicated inspection was conducted on flightline (25°F).

•  Inspector indicated he did not follow T.O. Chapter 1 requirements for allowing
instrument/standard temp to reach ambient.

2.2 Wrong probe used

•  Investigation/interview indicated unshielded probe was used. Shielded probe required
when inspecting around ferromagnetic bushings.

2.2.2.2 Inspector unprepared/pressured

•  Interviews with inspector and management identified concerns with tardiness and poor
morale affecting performance. Excessive command pressure indicated.

Figure B5. Example of color coded and numerically ordered Cause and Effect
Diagram and supporting spreadsheet. Only the Equipment breakdown is shown in the
supporting spreadsheet.




Step  7  -  Analyze  the  diagram.  Analysis  helps  identify  causes  that  warrant  further  investigation. 
Since  Cause-and-Effect  Diagrams  identify  only  possible  causes,  you  may  want  to  use  a  Pareto 
Chart  to  help  determine  the  cause  to  focus  on  first. 

o  Look  at  the  “balance”  of  your  diagram,  checking  for  comparable  levels  of  detail  for 
most  of  the  categories. 

>  A  thick  cluster  of  items  in  one  area  may  indicate  a  need  for  further  study. 

>  A  main  category  having  only  a  few  specific  causes  may  indicate  a  need  for 
further  identification  of  causes. 

>  If  several  major  branches  have  only  a  few  sub-branches,  you  may  need  to 
combine  them  under  one  category. 

o  Look for causes that appear repeatedly. These may represent root causes.

o  Look for what you can measure in each cause so you can quantify the effects of any
changes you make.

o  Most  importantly,  clearly  identify  the  causes  you  can  take  action  on. 

When  this  process  gets  to  the  point  where  a  cause  can  be  corrected  to  prevent  recurrence  in  a  way 
that  allows  meeting  your  objectives  and  is  within  your  control,  you  have  found  the  root  cause  or 
causes. 
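
Two of the checks in Step 7, the balance of the diagram and causes that appear repeatedly, lend themselves to a quick count. The following Python sketch assumes the diagram is held as a nested dictionary; the names `diagram`, `size`, and `count_causes` are illustrative choices, and only two branches from the flap track example are included.

```python
from collections import Counter


def size(tree):
    """Number of causes at all levels beneath a branch."""
    return sum(1 + size(deeper) for deeper in tree.values())


def count_causes(tree, counts=None):
    """Count how many times each cause label appears anywhere in the diagram."""
    counts = Counter() if counts is None else counts
    for cause, deeper in tree.items():
        counts[cause] += 1
        count_causes(deeper, counts)
    return counts


diagram = {
    "Equipment": {
        "Instrument drift": {
            "Equipment temp unstable": {
                "Unhangared inspection": {
                    "Inadequate procedures": {},
                    "Instructions ignored": {},
                },
            },
        },
        "Wrong probe used": {
            "Correct probe failed": {"No $ for replacement": {}},
            "Instructions ignored": {
                "Inadequate experience/training": {},
                "Inspector unprepared/pressured": {},
            },
        },
    },
    "Personnel": {
        "Rushed": {"Always late to work": {}, "Command pressure": {}},
    },
}

sizes = {branch: size(sub) for branch, sub in diagram.items()}
repeated = [cause for cause, n in count_causes(diagram).items() if n > 1]

print("causes per branch:", sizes)
print("repeated causes:", repeated)
```

A large difference in branch sizes suggests where further identification of causes may be needed, and a repeated label such as "Instructions ignored" is a candidate for closer examination as a possible root cause.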

Alternate  Approach  for  Charting  Cause  and  Effect 

An alternate approach for graphically depicting the same information presented in the Fishbone
Diagram is illustrated in Figure B6. The chart illustrates an inverted “tree” diagram which may be
useful for clearly briefing the results of the RCA to management.

Through examination of Figures B5 and B6 one can quickly identify areas of concern and isolate
the most probable contributing, direct and root causes. For example, the PROCEDURE branch
is completely red and therefore cannot be the location of the contributing or root causes.

However,  the  other  three  categories  (EQUIPMENT,  PERSONNEL  and  MANAGEMENT) 
contain  “greens”  which  are  candidate  locations  for  contributing  and  root  causes. 
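
The screening described here can be expressed as a short rule: a branch whose causes are all red is eliminated, and a branch containing at least one green remains a candidate. The Python sketch below illustrates that rule only. The function name `classify_branches`, the label "UNRESOLVED" for causes that require further analysis, and the individual color values are assumptions, chosen so that the result matches the outcome stated above.

```python
def classify_branches(colors_by_branch):
    """Split branches into eliminated (every cause RED) and candidates (any GREEN)."""
    eliminated, candidates = [], []
    for branch, colors in colors_by_branch.items():
        if colors and all(color == "RED" for color in colors):
            eliminated.append(branch)
        elif "GREEN" in colors:
            candidates.append(branch)
    return eliminated, candidates


# Assumed color assignments for illustration only.
colors_by_branch = {
    "PROCEDURE": ["RED", "RED", "RED"],
    "EQUIPMENT": ["GREEN", "UNRESOLVED", "GREEN", "RED"],
    "PERSONNEL": ["GREEN", "RED"],
    "MANAGEMENT": ["UNRESOLVED", "GREEN"],
}

eliminated, candidates = classify_branches(colors_by_branch)
print("eliminated:", eliminated)
print("candidates:", candidates)
```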

[Figure B6: inverted tree diagram. The effect, Missed Fatigue Crack in Flap Track, is at the top,
with branches 1.0 Procedures, 2.0 Equipment, 3.0 Personnel, and 4.0 Management beneath it.
Cause 2.2.2.2 Inspector unprepared/pressured is among the labeled nodes.]

Figure B6. Example of Cause and Effect Tree Diagram

APPENDIX C


CHANGE  ANALYSIS 


Change  Analysis  looks  at  a  problem  by  analyzing  the  deviation  between  what  is  expected  and 
what  actually  happened.  The  evaluator  essentially  asks  what  differences  occurred  to  make  the 
outcome  of  this  task  or  activity  different  from  all  the  other  times  this  task  or  activity  was 
successfully  completed. 

This  technique  consists  of  asking  the  questions:  What?  When?  Where?  Who?  How?  Answering 
these  questions  should  provide  direction  toward  answering  the  root  cause  determination  question: 
Why? 

Primary  and  secondary  questions  included  within  each  category  will  provide  the  prompting 
necessary  to  thoroughly  answer  the  overall  question.  Some  of  the  questions  will  not  be  applicable 
to  any  given  condition.  Some  amount  of  redundancy  exists  in  the  questions  to  ensure  that  all 
items  are  addressed. 

Several  key  elements  include  the  following: 

•  Consider  the  event  containing  the  undesirable  consequences. 

•  Consider  a  comparable  activity  that  did  not  have  the  undesirable  consequences. 

•  Compare  the  condition  containing  the  undesirable  consequences  with  the  reference 
activity. 

•  Set  down  all  known  differences  whether  they  appear  to  be  relevant  or  not. 

•  Analyze  the  differences  for  their  effects  in  producing  the  undesirable  consequences.  This 
must  be  done  with  careful  attention  to  detail,  ensuring  that  obscure  and  indirect 
relationships  are  identified  (e.g.,  a  change  in  color  or  finish  may  change  the  heat  transfer 
parameters  and  consequently  affect  system  temperature). 

•  Integrate  information  into  the  investigative  process  relevant  to  the  causes  of,  or  the 
contributors  to,  the  undesirable  consequences. 

Change  Analysis  is  a  good  technique  to  use  whenever  the  causes  of  the  condition  are  obscure, 
you  do  not  know  where  to  start,  or  you  suspect  a  change  may  have  contributed  to  the  condition. 

Not  recognizing  the  compounding  of  change  (e.g.,  a  change  made  five  years  previously  combined 
with  a  change  made  recently)  is  a  potential  shortcoming  of  Change  Analysis.  Not  recognizing  the 
introduction  of  gradual  change  as  compared  with  immediate  change  also  is  possible. 

This  technique  may  be  adequate  to  determine  the  root  cause  of  a  relatively  simple  condition.  In 
general,  though,  it  is  not  thorough  enough  to  determine  all  the  causes  of  more  complex 
conditions. 

Figure C-1 shows the six steps involved in Change Analysis. Figure C-2 is the Change Analysis
worksheet. 
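
The comparison at the heart of Change Analysis can be sketched as a difference between two records: one describing a comparable activity without the undesirable consequences and one describing the incident. The following Python example is a minimal illustration. The function `find_differences` and the factor names are assumptions, and the reference values describe a hypothetical earlier inspection; the incident values follow the flap track example in Appendix A.

```python
def find_differences(reference: dict, incident: dict) -> dict:
    """Return every factor whose value differs, whether or not it looks relevant."""
    factors = sorted(set(reference) | set(incident))
    return {
        factor: (reference.get(factor), incident.get(factor))
        for factor in factors
        if reference.get(factor) != incident.get(factor)
    }


# Hypothetical reference activity: a comparable inspection completed successfully.
reference = {
    "method": "eddy current",
    "location": "hangar",
    "ambient_temp_F": 65,
    "equipment_stabilized": True,
    "probe": "shielded",
}

# Incident values taken from the flap track example.
incident = {
    "method": "eddy current",
    "location": "flightline",
    "ambient_temp_F": 25,
    "equipment_stabilized": False,
    "probe": "unshielded",
}

differences = find_differences(reference, incident)
for factor, (expected, actual) in differences.items():
    print(f"{factor}: expected {expected!r}, actual {actual!r}")
```

The function deliberately reports all differences without judging relevance; deciding which differences produced the undesirable consequences remains an analytical step for the evaluator.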

The following questions help identify information required on the worksheet:

WHAT? 

•  What  is  the  inspection? 

•  What  occurred  to  require  the  inspection? 

•  What  occurred  prior  to  the  inspection? 

o  Was  the  inspection  area/surface  prepared  properly? 

•  What  occurred  following  the  inspection? 

•  What  activity  was  in  progress  when  the  inspection  was  being  performed? 

•  What  activity  was  in  progress  when  the  missed  crack  was  identified? 

o  Operational  evolution  in  the  work  space? 

■  Power  increase/decrease? 

■  Environmental  change  (temperature,  precipitation,  wind,  noise)? 

■  Starting/stopping  inspection  or  inspection  interruption? 

■  Inspection  access? 

■  Inspector  position  during  inspection? 
o  Operational  evolution  outside  the  work  space? 

■  Removing  inspection  equipment  from  service? 

■  Returning  inspection  equipment  to  service? 

o  Maintenance  activity  just  prior  or  at  time  of  inspection? 

■  Corrective  maintenance? 

■  Modification  or  configuration  change? 

■  Troubleshooting? 
o  Training  activity? 

•  What  inspection  equipment,  sensors  and  standards  were  involved  when  the  defect 
condition  was  missed  by  the  suspect  inspection? 

•  What  was  the  operational  condition  of  the  equipment,  standards  and  sensors? 

o  What  is  the  equipment’s  function? 
o  How  does  it  function? 
o  Were  operational  procedures  available? 
o  Was  the  equipment  operating  properly? 
o  Was  the  procedure  followed  properly? 

o  What  maintenance  or  calibration  has  been  performed  on  the  inspection 
equipment? 

o  What  modifications  have  been  made  to  the  equipment,  standards  or  sensors? 

•  What  inspection  equipment,  sensors  and  standards  were  involved  when  the  defect 
condition  was  identified? 

o  What  was  the  operational  condition  of  the  equipment,  standards  and  sensors? 
o  What  is  the  equipment’s  function? 
o  How  does  it  function? 
o  What  operational  procedures  were  available? 
o  Was  the  equipment  operated  properly? 

o  What  maintenance  or  calibration  has  been  performed  on  the  inspection 
equipment? 

o  What  modifications  have  been  made  to  the  equipment,  standards  or  sensors? 

■  Were  the  modifications  authorized? 

WHEN?

•  When  did  incident  (failure  to  detect  defect)  occur? 

•  When  was  the  defect  identified? 

•  What was the facility’s status at the time of the incident (failed inspection)?

•  What was the facility’s status at the time the defect was identified?

•  What effects did the time of day have on the incident? Did it affect:

o  Information  availability? 
o  Personnel  availability? 
o  Ambient  lighting? 
o  Ambient  temperature? 

•  Did  the  incident  involve  shift-work  personnel?  If  so: 

o  What  type  of  shift  rotation  was  in  use? 
o  Where  in  the  rotation  were  the  personnel? 

•  For  how  many  continuous  hours  had  any  involved  personnel  been  working? 

•  Did  the  incident  occur  at  the  beginning,  middle  or  end  of  the  work  week? 

•  Did  the  incident  occur  immediately  before  or  after  a  holiday? 

WHERE? 

•  Where  did  the  incident  (failed  inspection)  occur? 

•  Where  was  the  defect  detected? 

•  What  were  the  physical  conditions  in  the  area? 

•  Where  was  the  incident  identified? 

•  Was  location  a  factor  in  causing  the  condition? 
o  Human  factor? 

■  Lighting? 

■  Noise? 

■  Temperature? 

■  Equipment  labeling? 

■  Personal  protective  equipment  required  in  the  area? 

■  Accessibility? 

■  Indication  availability? 

■  Other  activities  in  the  area? 

■  What  position  is  required  to  perform  tasks  in  the  area? 
o  Equipment  factor? 

■  Humidity? 

■  Temperature? 

■  Cleanliness? 


HOW? 

•  Was the incident (failed inspection) a result of, or caused by, an inappropriate action?

o  An omitted action?

o  An extraneous action?

o  An action performed out of sequence?

o  An action performed to too small a degree? To too large a degree?

•  Were  the  procedures  used  a  factor  in  the  incident  (failed  inspection)? 

o  Was  there  an  applicable  procedure? 
o  Was  the  correct  procedure  used? 
o  Was  the  procedure  followed? 

■  Followed  in  sequence? 

■  Followed  "blindly"-without  thought? 

o  Was the procedure:

■  Legible? 

■  Misleading? 

■  Confusing? 

■  An  approved,  current  revision? 

■  Adequate  to  do  the  task? 

■  In  compliance  with  other  regulations? 
o  Did  the  procedure: 

■  Have  sufficient  detail? 

■  Have  sufficient  warnings  and  precautions? 

■  Adequately  identify  techniques  and  components? 

■  Have  steps  in  the  proper  sequence? 

■  Require  adequate  work  review? 

■  Make  inaccurate  assumptions? 


WHO? 

•  Which  personnel: 

o  Were  involved  with  the  incident  (failed  inspection)? 
o  Missed  the  defect? 
o  Observed  the  inspection? 
o  Identified/detected  the  defect? 
o  Reported  the  defect? 

•  What  were: 

o  The  qualifications  of  these  personnel? 
o  The  experience  levels  of  these  personnel? 
o  The  work  groups  of  these  personnel? 
o  The  attitudes  of  these  personnel? 

•  Did  the  personnel  involved: 

o  Have  adequate  instruction? 
o  Have  adequate  supervision? 
o  Have  adequate  training? 
o  Have  adequate  knowledge? 
o  Communicate  effectively? 
o  Perform  correct  actions? 
o  Worsen  the  condition? 
o  Mitigate  the  condition? 

Figure C-1. Six Steps Involved in Change Analysis


Change Factor                            Difference/Change    Effect    Question to Answer
---------------------------------------  -------------------  --------  ------------------
What?
(Conditions, activity, equipment)

When?
(Occurred, identified, schedule)

How?
(Work practice, omission, extraneous
action, out of sequence procedure)

Who?
(Personnel involved, training,
qualification, supervision)

Figure C-2. Change Analysis Worksheet

APPENDIX D


HUMAN  PERFORMANCE  EVALUATION 

Human  performance  evaluation  is  the  assessment  of  the  human  interaction  in  the  performance  of 
specific tasks. Performance of human-dependent tasks can be subdivided into four primary
operations: 

a.  Input  detection 

b.  Input  understanding 

c.  Action  selection 

d.  Action  execution. 

Facility  and  equipment  operability,  procedures  and  documentation,  and  management 
attitudes  are  all  part  of  the  work  environment  that  needs  to  be  evaluated  for  each  of  these  steps. 
Common  human  factors  problems  to  be  considered  are: 

Inspector  Factors: 

•  Cognitive  overload 

•  Cognitive  underload/boredom 

•  Habit  intrusion 

•  Lapse  of  memory/recall 

•  Spatial  misorientation 

•  Mindset/preconceived  idea 

•  Tunnel  vision  or  lack  of  big  picture 

•  Unawareness 

•  Wrong  assumptions  made 

•  Reflex/instinctive action

•  Thinking  and  actions  not  coordinated 

•  Insufficient  degree  of  attention  applied 

•  Shortcuts  evoked  to  complete  job 

•  Complacency/lack  of  perceived  need  for  concern 

•  Confusion

•  Misdiagnosis 

•  Fear  of  failure/consequences 

•  Fear  of  false-calls 

•  Fear  of  consequences  of  an  inspection  find  (affecting  mission) 

•  Tired/fatigued 

Management  Factors: 

•  Poor  task  management/assignment 

•  Insufficient  time  allotted  for  task 

•  Negative  reinforcement 

•  Negative  personality 

•  Inability  to  prioritize 

•  Inability to delegate

•  Inability  to  set  goals 

•  Inability  to  act  on  information 

•  Unawareness  of  personnel  capabilities/training 

•  Unawareness  of  responsibilities,  objectives  and  organizational  capabilities 

•  Production/time/mission  pressure 

•  Poor  communication  with  subordinate  personnel 

•  Poor  communication  with  superiors 

•  Unfamiliarity  with  task  requirements 

Where risk is high and very sensitive to noncompliance with requirements, each of the human
performance  factors  should  be  considered  in  order  to  achieve  a  high  degree  of  reliability.  These 
factors  also  should  be  considered  in  system  design/control  and  operator  training,  as  well  as  causal 
factor  determination  and  corrective  action  decisions. 
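
One way to organize the results of a human performance evaluation is to record each finding against the operation in which it arose and then locate the earliest affected operation. The Python sketch below is a hedged illustration of that bookkeeping. The function `first_affected_stage` is an assumption, and the pairing of each factor with an operation in `findings` is hypothetical; in practice the investigator makes that assignment from the evidence.

```python
STAGES = [
    "Input detection",
    "Input understanding",
    "Action selection",
    "Action execution",
]


def first_affected_stage(findings):
    """findings: list of (stage, factor) pairs. Return the earliest stage with a finding."""
    for stage, _ in findings:
        if stage not in STAGES:
            raise ValueError(f"unknown stage: {stage!r}")
    affected = {stage for stage, _ in findings}
    for stage in STAGES:
        if stage in affected:
            return stage
    return None


# Hypothetical findings for illustration only.
findings = [
    ("Action selection", "Shortcuts evoked to complete job"),
    ("Input understanding", "Misdiagnosis"),
    ("Action selection", "Production/time/mission pressure"),
]

print("earliest affected operation:", first_affected_stage(findings))
```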

APPENDIX E


CAUSE  CODES 

Note:  Cause  Codes  can  be  tailored  to  a  specific  incident  being  investigated.  The  following 
Cause  Codes  have  been  developed  for  the  investigation  of  the  failure  of  a  non-destructive 
inspection  technique  to  detect  an  otherwise  detectable  defect  or  flaw  in  an  aircraft  structure. 

1.  Equipment/Material  Deficiency 

1A  =  Defective  probe,  sensor,  cable  or  connector 

1B = Defective instrumentation

1C = Damaged equipment

1D = Damage during shipping or error in marking

1E = Equipment out of calibration

1F = Electrical or instrument noise

1G  =  Inadequate  surface  condition  or  surface  contamination 

1H  =  Other  equipment  or  material  deficiency 

2.  Procedure  Problem 

2A  =  Defective  or  inadequate  procedure 

2B  =  Lack  of  clarity  in  procedure 

2C  =  Lack  of  procedures  (no  procedures  provided) 

2D  =  Lack  of  currency  in  procedure  (i.e.  current  T.O.  not  available) 

2E  =  Procedure  not  properly  validated  or  capability  not  verified 

2F  =  Other  procedure  problems 

3.  Personnel  Error 

3A  =  Inattention  to  detail 

3B  =  Violation  of  requirement  or  procedure  (i.e.,  did  not  follow  T.O.) 

3C  =  Verbal  communication  problem 

3D  =  Did  not  perform  required  inspection 

3E  =  Other  human  error 

4.  Design  Problem 

4A  =  Inadequate  man-machine  interface 

4B  =  Inadequate  or  defective  NDI  equipment  design 

4C  =  Error  in  equipment  or  reference  standard 

4D  =  Error  in  detection  capability  assumption 

4E = Error in inspection interval/damage propagation rate assumptions

4F  =  Design  crack  scenario  not  representative  of  field  observation 

4G  =  Drawing,  specification,  or  data  errors 

4H  =  Other  design  problem 

5.  Training  Deficiency 

5A = No training provided

5B  =  Insufficient  practice  or  hands-on  experience 

5C  =  Inadequate  training  content 

5D  =  Insufficient  refresher  training 

5E  =  Inadequate  presentation  or  materials 
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5F  =  Improper  technician  qualification  level  used  for  inspection 

5G  =  Other  training  deficiency 

6.  Management  Problem 

6A  =  Inadequate  administrative  control 

6B  =  Work  organization/planning  deficiency 

6C  =  Inadequate  supervision 

6D  =  Improper  resource  allocation 

6E  =  Incorrect  priority  placed  on  inspection  tasks 

6F  =  Insufficient  time  allotted  for  inspection  tasks 

6G  =  Policy  not  adequately  defined,  disseminated,  or  enforced 

6H  =  Other  management  problem 

7.  External  Phenomenon 

7A  =  Poor  weather 

7B = Poor ambient inspection conditions (e.g., too hot or cold, insufficient or excessive lighting)

7C  =  Power  failure  or  transient 

7D  =  External  fire  or  explosion 

7E  =  Theft,  tampering,  sabotage,  or  vandalism 

7F  =  Location  and  accessibility  of  part  being  inspected 

7G  =  Physical  position  of  inspector  (ergonomics) 

7H  =  Amount  of  disassembly  required 

7I = Other external phenomenon
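The cause codes above follow a regular pattern: a category digit (1 through 7), a subcategory letter, and, on the worksheets, an optional Roman numeral column (I through IV). The following is a minimal Python sketch of how such codes might be validated in an electronic record. The function name `parse_cause_code` and the returned dictionary layout are illustrative assumptions, not part of this guide.

```python
import re

# Category names copied from the Appendix E headings.
CATEGORIES = {
    1: "Equipment/Material Deficiency",
    2: "Procedure Problem",
    3: "Personnel Error",
    4: "Design Problem",
    5: "Training Deficiency",
    6: "Management Problem",
    7: "External Phenomenon",
}

# Last subcategory letter defined for each category in Appendix E.
LAST_LETTER = {1: "H", 2: "F", 3: "E", 4: "H", 5: "G", 6: "H", 7: "I"}

# Digit, letter, then an optional worksheet column such as "-II".
CODE_PATTERN = re.compile(r"^([1-7])([A-I])(?:-(IV|III|II|I))?$")


def parse_cause_code(text):
    """Split a code such as '3B' or '3B-II' into its parts (hypothetical helper)."""
    match = CODE_PATTERN.match(text.strip().upper())
    if match is None:
        raise ValueError(f"not a recognizable cause code: {text!r}")
    category = int(match.group(1))
    letter = match.group(2)
    if letter > LAST_LETTER[category]:
        raise ValueError(f"category {category} has no subcategory {letter}")
    return {
        "category": category,
        "category_name": CATEGORIES[category],
        "subcategory": f"{category}{letter}",
        "column": match.group(3),  # None when no worksheet column is given
    }


if __name__ == "__main__":
    print(parse_cause_code("3B-II"))
    print(parse_cause_code("7I"))
```

A check of this kind also catches transcription slips such as a digit entered in place of a letter, because the pattern accepts only the letters A through I in the second position.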
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APPENDIX  F 


CAUSAL  FACTOR  WORKSHEETS 

After  the  appropriate  root  cause  methods  have  been  used  to  identify  the  direct  cause(s),  the  root 
cause,  and  any  applicable  contributing  cause(s),  these  various  causes  can  be  categorized  by  using 
one  or  more  of  the  worksheets  in  this  appendix.  Each  of  the  seven  major  cause  worksheets  has  a 
matrix  to  list  the  applicable  subcategory  cause  for  each  finding.  The  same  subcategory  cause  may 
be listed for up to four similar causes under columns I through IV. For similar causes, fill out
columns I through IV in order of impact or importance.

Worksheet  Instructions: 

1. Check each worksheet as applicable or not applicable.

2.  List  subcategory  cause  information  on  each  applicable  worksheet. 

a.  List  the  applicable  subcategory  cause  for  the  root  cause,  the  contributing  causes, 
and  the  direct  cause  by  placing  an  R,  C,  or  D  in  the  appropriate  box.  (The  same 
cause  may  be  listed  for  up  to  four  similar  findings;  for  example,  four  different  missed 
cracks). 

b.  Under  cause  description,  reference  each  cause  with  the  code  and  Roman  numeral 
from  the  matrix  and  describe  each  cause  (explain  how  it  was  related  to  the  incident). 

c.  Select  the  most  direct  causes  and  the  root  causes  (the  ones  for  which  corrective 
action  will  prevent  recurrence  and  have  the  greatest,  most  widespread  effect).  In 
cause  selection,  focus  on  programmatic  and  system  deficiencies  and  avoid  simple 
excuses  such  as  blaming  the  inspector.  Note  that  the  root  cause  or  causes  must  be 
an  explanation  (the  why)  of  the  direct  causes,  not  a  repeat  of  the  direct  cause.  In 
addition,  a  cause  description  is  not  just  a  repeat  of  the  category  code  description;  it  is 
a  description  specific  to  the  incident  (failed  inspection). 

d.  Under  recommended  corrective  actions,  list  the  action  intended  to  correct  each 
cause  to  prevent  recurrence.  Describe  the  corrective  actions  selected  to  prevent 
recurrence,  including  the  reason  why  they  were  selected,  and  how  they  will  prevent 
recurrence. 

3.  Transfer  the  direct,  the  root,  and  contributing  causes  and  the  corrective  actions  to  the 
Worksheet  Summary.  When  there  are  more  than  three  contributing  causes,  select  those  that  result 
in  the  greatest  and  most  widespread  improvement  when  corrected.  Rank  order  the  causes  in  order 
of  significance  in  columns  I  through  IV. 

(Note  that  even  though  only  three  contributing  causes  may  be  reported,  corrective  actions 
should  be  made  for  all  identified  causes). 
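Step 3 above can be sketched in Python as follows: findings are grouped by rating (D, R, or C), ranked by significance, and at most three contributing causes are carried forward to the Worksheet Summary. The `Finding` class, the numeric `impact` score, and the sample impact values are assumptions made for illustration. The guide does not prescribe a scoring scheme, so the investigator's own ranking should be substituted.

```python
from dataclasses import dataclass

RATING_NAMES = {"D": "Direct Cause", "R": "Root Cause", "C": "Contributing Cause"}
MAX_CONTRIBUTING = 3  # the summary reports at most three contributing causes


@dataclass
class Finding:
    code: str         # cause code, e.g. "6D"
    column: str       # worksheet column: "I", "II", "III", or "IV"
    rating: str       # "D", "C", or "R"
    impact: int       # hypothetical ranking score; higher means greater impact
    description: str  # incident-specific explanation of the cause


def build_summary(findings):
    """Group findings by rating, rank them, and cap the contributing causes."""
    summary = {name: [] for name in RATING_NAMES.values()}
    for finding in findings:
        if finding.rating not in RATING_NAMES:
            raise ValueError(f"rating must be D, C, or R: {finding.rating!r}")
        summary[RATING_NAMES[finding.rating]].append(finding)
    for group in summary.values():
        group.sort(key=lambda f: f.impact, reverse=True)
    summary["Contributing Cause"] = summary["Contributing Cause"][:MAX_CONTRIBUTING]
    return summary


if __name__ == "__main__":
    # Illustrative entries only; the impact values are invented for the sketch.
    findings = [
        Finding("3B", "I", "D", 5, "Incorrect probe used"),
        Finding("3B", "II", "C", 3, "Inspection performed outside the hangar"),
        Finding("6D", "I", "R", 5, "Insufficient qualified personnel scheduled"),
        Finding("6F", "I", "C", 4, "Pressure to complete the inspection quickly"),
    ]
    for rating, group in build_summary(findings).items():
        print(rating, [f"{f.code}-{f.column}" for f in group])
```

Capping the list affects only what is reported. As the note above states, corrective actions should still be made for every identified cause, so the full list of findings should be retained.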
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1.  Equipment/Material  Worksheet 


[ ] Applicable    [ ] Not Applicable

Why was "Equipment/Material" a Cause?

Rate each cause as D, C, or R for each of the applicable cause codes:

D = Direct Cause
C = Contributing Cause
R = Root Cause

Equipment/Material Problem Cause Codes                      Causes: I    II   III  IV
1A = Defective probe, sensor, cable or connector
1B = Defective instrumentation
1C = Damaged equipment
1D = Damaged equipment during shipping or error in marking
1E = Equipment out of calibration
1F = Electrical or instrument noise
1G = Inadequate surface condition or surface contamination
1H = Other equipment/material problem

Cause  Description: 


Recommended  Corrective  Actions: 
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2.  Procedure  Problem 


[ ] Applicable    [ ] Not Applicable

Why was a "Procedure Problem" a Cause?

Rate each cause as D, C, or R for each of the applicable cause codes:

D = Direct Cause
C = Contributing Cause
R = Root Cause

Procedure Problem Cause Codes                                       Causes: I    II   III  IV
2A = Defective or inadequate procedures
2B = Lack of clarity in procedure
2C = Lack of procedures (no procedures provided)
2D = Lack of currency in procedure (T.O. not updated)
2E = Procedures not properly validated or capability not verified
2F = Other procedure problem

Cause  Description: 


Recommended  Corrective  Actions: 
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3.  Personnel  Error 


[ ] Applicable    [ ] Not Applicable

Why was "Personnel Error" a Cause?

Rate each cause as D, C, or R for each of the applicable cause codes:

D = Direct Cause
C = Contributing Cause
R = Root Cause

Personnel Error Cause Codes                                             Causes: I    II   III  IV
3A = Inattention to detail
3B = Violation of requirement or procedure (i.e., did not follow T.O.)
3C = Verbal communication problem
3D = Did not perform required inspection
3E = Other human error

Cause  Description: 


Recommended  Corrective  Actions: 
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4.  Design  Problem 


[ ] Applicable    [ ] Not Applicable

Why was the "Design" a Cause?

Rate each cause as D, C, or R for each of the applicable cause codes:

D = Direct Cause
C = Contributing Cause
R = Root Cause

Design Problem Cause Codes                                            Causes: I    II   III  IV
4A = Inadequate man-machine interface
4B = Inadequate or defective NDI equipment design
4C = Error in equipment or reference standard
4D = Error in detection capability assumption
4E = Error in inspection interval/damage propagation rate assumption
4F = Design crack scenario not representative of field observation
4G = Drawing, specification or data errors
4H = Other design problem

Cause  Description: 


Recommended  Corrective  Actions: 
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5.  Training  Deficiency 


[ ] Applicable    [ ] Not Applicable

Why was a "Training Deficiency" a Cause?

Rate each cause as D, C, or R for each of the applicable cause codes:

D = Direct Cause
C = Contributing Cause
R = Root Cause

Training Deficiency Cause Codes                                     Causes: I    II   III  IV
5A = No training provided
5B = Insufficient practice or hands-on experience
5C = Inadequate training content
5D = Insufficient refresher training
5E = Inadequate presentation materials
5F = Inadequate technical qualification level used for inspection
5G = Other training deficiency

Cause  Description: 


Recommended  Corrective  Actions: 




6.  Management  Deficiency 


[ ] Applicable    [ ] Not Applicable

Why was a "Management Deficiency" a Cause?

Rate each cause as D, C, or R for each of the applicable cause codes:

D = Direct Cause
C = Contributing Cause
R = Root Cause

Management Deficiency Cause Codes                               Causes: I    II   III  IV
6A = Inadequate administrative control
6B = Work organization/planning deficiency
6C = Inadequate supervision
6D = Improper resource allocation
6E = Incorrect priority placed on inspection tasks
6F = Insufficient time allotted for inspection tasks
6G = Policy not adequately defined, disseminated or enforced
6H = Other management problem

Cause  Description: 


Recommended  Corrective  Actions: 
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7.  External  Phenomenon 


[ ] Applicable    [ ] Not Applicable

Why was an "External Phenomenon" a Cause?

Rate each cause as D, C, or R for each of the applicable cause codes:

D = Direct Cause
C = Contributing Cause
R = Root Cause

External Phenomenon Problem Cause Codes                                 Causes: I    II   III  IV
7A = Poor weather
7B = Poor ambient conditions (e.g., too hot, too cold, poor lighting)
7C = Power failure or transient
7D = External fire, explosion or emergency
7E = Theft, tampering, sabotage or vandalism
7F = Location and accessibility of part being inspected
7G = Physical position of inspector (ergonomics)
7H = Amount of disassembly required
7I = Other external phenomenon

Cause  Description: 


Recommended  Corrective  Actions: 
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Worksheet  Summary 


Problem/Deficiency Category       Root Cause

Operational Readiness Problem
Equipment/Material Problem
Procedure Problem
Personnel Error
Management/Field Bridge Problem
Design Problem
Training Deficiency
Management Problem
External Phenomenon

Cause  Description: 


Corrective  Actions: 
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APPENDIX  G 


EXAMPLE  INCIDENT  ANALYSIS 


Incident  Description 

A 30-day time compliance technical order (TCTO) was released requiring eddy current
inspection of the wing flap track on a dual-engine US Air Force aircraft. The TCTO tech data
was successfully validated by depot NDI and structural engineering prior to TCTO release. The
TCTO was scheduled to be accomplished on 21 December 2004 on A/C 90-678, prior to a
scheduled 7:00 AM training mission. The aircraft was grounded pending successful performance
of the TCTO. The inspector assigned to the inspection task was late arriving to work. No other
inspection personnel were available at the time due to holiday vacations. To expedite flight
preparations, the crew chief rolled the aircraft out of the hangar at 8:00 AM to perform an engine
check. At 9:35 AM the inspector arrived at work and was assigned the task by his supervisor. At
10:00 AM the inspector arrived at the aircraft and performed the inspection. The outdoor ambient
temperature was 25°F with light winds at the time of the inspection. The indoor ambient
temperature where the inspection equipment was stored was 65°F. On 27 February 2005,
A/C 90-678 experienced an in-flight failure of the left engine, declared an in-flight emergency, and
landed safely at a local municipal airport. The incident was classified as a Class A mishap because
it resulted in more than $500K of damage to the aircraft.

A subsequent failure analysis revealed that the left-wing flap track failed, resulting in the in-flight
liberation of a mounting bracket. The bracket was ingested by the engine, resulting in FOD
damage and engine failure. Laboratory analysis indicated that the flap track failed due to fatigue.
Analysis also indicated that the crack size, a, was between a_NDI and a_critical (a > a_NDI but
a < a_critical) and that the crack was present during the previous inspection.
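When building a sequence of events for an incident like this one, it can help to reduce the narrative to elapsed times and differences. The short Python sketch below does that arithmetic for the dates, clock times, and temperatures quoted in the description above. The variable and function names are illustrative assumptions, and the clock times are simply rewritten in 24-hour form.

```python
from datetime import date, datetime

# Dates, clock times, and temperatures copied from the incident description.
INSPECTION_DATE = date(2004, 12, 21)
FAILURE_DATE = date(2005, 2, 27)
STORAGE_TEMP_F = 65   # indoor temperature where the equipment was stored
OUTDOOR_TEMP_F = 25   # outdoor temperature during the inspection

EVENTS = [
    ("Scheduled training mission", "07:00"),
    ("Aircraft rolled out of hangar", "08:00"),
    ("Inspector arrives at work", "09:35"),
    ("Inspector arrives at aircraft", "10:00"),
]


def days_between(earlier, later):
    """Whole days from one calendar date to another."""
    return (later - earlier).days


def minutes_after_first(events):
    """Return (label, minutes elapsed since the first event) pairs."""
    times = [datetime.strptime(clock, "%H:%M") for _, clock in events]
    start = times[0]
    return [
        (label, int((moment - start).total_seconds() // 60))
        for (label, _), moment in zip(events, times)
    ]


if __name__ == "__main__":
    print("Days from inspection to failure:",
          days_between(INSPECTION_DATE, FAILURE_DATE))
    print("Storage-to-outdoor temperature difference (F):",
          STORAGE_TEMP_F - OUTDOOR_TEMP_F)
    for label, minutes in minutes_after_first(EVENTS):
        print(f"{minutes:4d} min  {label}")
```

Laid out this way, the timeline shows that the inspection began three hours after the scheduled mission time and at most 25 minutes after the inspector was assigned the task, with a 40°F difference between the equipment storage area and the inspection location.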

Findings: 

The resulting investigation revealed that the inspector violated procedural requirements and
elected to ignore TCTO requirements for aircraft hangaring during inspection. Furthermore, the
inspector did not allow the inspection equipment, probes, and reference standard to reach ambient
conditions prior to beginning the inspection. Interviews with the inspector also revealed that the
inspector used the incorrect “unshielded” probe. A shielded probe was required by the
part-specific procedure to inspect around a ferromagnetic fastener. The inspector inadvertently
selected the incorrect probe and only realized the mistake during the inspection. Due to schedule
and pervasive command pressures, the inspector elected to complete the inspection with the
“unshielded” probe. Failure to use the correct probe, combined with the failure to perform the
inspection in a temperature-controlled environment (hangar), resulted in inspection noise,
excessive instrument drift, and misinterpretation of inspection results, ultimately leading to the
failure to detect an otherwise detectable fatigue crack.

•  Direct  Cause:  Failure  of  the  inspector  to  use  the  correct  probe  during  the  inspection. 

•  Contributing Causes:

o  Failure of the inspector to perform the inspection in a temperature-controlled
environment.

o  Pressure  to  complete  the  inspection  when  unprepared. 

•  Root  Causes: 

o  Failure  of  management  to  ensure  sufficient  personnel  resources  to  address 
mission  needs. 
o  Failure of the organization to enforce a culture of professionalism and
safety-consciousness in the workplace.


[Figure residue: chart text fragments "Crew chief decides to expedite prep while waiting for ..." and "... vacation and sick"; the remainder of the figure is not recoverable.]


1.  Equipment/Material  Worksheet 


[ ] Applicable    [XX] Not Applicable

Why was "Equipment/Material" a Cause?

Rate each cause as D, C, or R for each of the applicable cause codes:

D = Direct Cause
C = Contributing Cause
R = Root Cause

Equipment/Material Problem Cause Codes                      Causes: I    II   III  IV
1A = Defective probe, sensor, cable or connector
1B = Defective instrumentation
1C = Damaged equipment
1D = Damaged equipment during shipping or error in marking
1E = Equipment out of calibration
1F = Electrical or instrument noise
1G = Inadequate surface condition or surface contamination
1H = Other equipment/material problem

2.  Procedure  Problem 

[ ] Applicable    [XX] Not Applicable

Why was a "Procedure Problem" a Cause?

Rate each cause as D, C, or R for each of the applicable cause codes:

D = Direct Cause
C = Contributing Cause
R = Root Cause

Procedure Problem Cause Codes                                       Causes: I    II   III  IV
2A = Defective or inadequate procedures
2B = Lack of clarity in procedure
2C = Lack of procedures (no procedures provided)
2D = Lack of currency in procedure (T.O. not updated)
2E = Procedures not properly validated or capability not verified
2F = Other procedure problem
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3.  Personnel  Error 


[XX] Applicable    [ ] Not Applicable

Why was "Personnel Error" a Cause?

Rate each cause as D, C, or R for each of the applicable cause codes:

D = Direct Cause
C = Contributing Cause
R = Root Cause

Personnel Error Cause Codes                                             Causes: I    II   III  IV
3A = Inattention to detail
3B = Violation of requirement or procedure (i.e., did not follow T.O.)          D    C
3C = Verbal communication problem
3D = Did not perform required inspection
3E = Other human error

Cause  Description: 

3B-I Violation of Requirement or Procedure. Direct Cause: The inspector elected to violate procedure and use the incorrect
"unshielded" probe.

3B-II Violation of Requirement or Procedure. Contributing Cause: The inspector elected to ignore procedure requirements
to perform the inspection in a temperature-stable environment, i.e., a hangar.


Recommended  Corrective  Actions: 

1. Train personnel on the role of inspections to maintain flight safety and the critical nature of the task

2.  Emphasize  a  Safety-First  culture 


4.  Design  Problem 


[ ] Applicable    [XX] Not Applicable

Why was the "Design" a Cause?

Rate each cause as D, C, or R for each of the applicable cause codes:

D = Direct Cause
C = Contributing Cause
R = Root Cause

Design Problem Cause Codes                                            Causes: I    II   III  IV
4A = Inadequate man-machine interface
4B = Inadequate or defective NDI equipment design
4C = Error in equipment or reference standard
4D = Error in detection capability assumption
4E = Error in inspection interval/damage propagation rate assumption
4F = Design crack scenario not representative of field observation
4G = Drawing, specification or data errors
4H = Other design problem
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5.  Training  Deficiency 


[ ] Applicable    [XX] Not Applicable

Why was a "Training Deficiency" a Cause?

Rate each cause as D, C, or R for each of the applicable cause codes:

D = Direct Cause
C = Contributing Cause
R = Root Cause

Training Deficiency Cause Codes                                     Causes: I    II   III  IV
5A = No training provided
5B = Insufficient practice or hands-on experience
5C = Inadequate training content
5D = Insufficient refresher training
5E = Inadequate presentation materials
5F = Inadequate technical qualification level used for inspection
5G = Other training deficiency

6.  Management  Deficiency 


[XX] Applicable    [ ] Not Applicable

Why was a "Management Deficiency" a Cause?

Rate each cause as D, C, or R for each of the applicable cause codes:

D = Direct Cause
C = Contributing Cause
R = Root Cause

Management Deficiency Cause Codes                               Causes: I    II   III  IV
6A = Inadequate administrative control
6B = Work organization/planning deficiency
6C = Inadequate supervision
6D = Improper resource allocation                                       R
6E = Incorrect priority placed on inspection tasks                      R
6F = Insufficient time allotted for inspection tasks                    C
6G = Policy not adequately defined, disseminated or enforced
6H = Other management problem

Cause  Description: 

6D-I Improper Resource Allocation. Root Cause: Supervision failed to ensure sufficient qualified personnel were scheduled
to work to meet mission needs.

6E-I Incorrect Priority Placed on Inspection Tasks. Root Cause: Organization and management culture emphasizes schedule over safety.

6F-I Insufficient Time Allotted for Inspection Tasks. Contributing Cause: Undue pressure was placed on the inspector to rapidly
complete the safety-critical inspection. This pressure indirectly influenced the inspector to ignore specific procedural guidance
in order to rapidly clear the aircraft for flight.

Recommended  Corrective  Actions: 

1. Seek efficiencies in other operations to free up additional time for inspection

2.  Conduct  manpower  review  to  assess  availability  of  qualified  personnel  to  meet  mission  needs  including  primary  and  backup  personnel 

3.  Establish  and  enforce  policies  to  establish  inspection  as  a  Safety-Critical  event  that  will  drive  schedule 

4.  Hold  management  accountable  for  manpower  availability 
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7.  External  Phenomenon 


[XX] Applicable    [ ] Not Applicable

Why was an "External Phenomenon" a Cause?

Rate each cause as D, C, or R for each of the applicable cause codes:

D = Direct Cause
C = Contributing Cause
R = Root Cause

External Phenomenon Problem Cause Codes                                 Causes: I    II   III  IV
7A = Poor weather
7B = Poor ambient conditions (e.g., too hot, too cold, poor lighting)           C
7C = Power failure or transient
7D = External fire, explosion or emergency
7E = Theft, tampering, sabotage or vandalism
7F = Location and accessibility of part being inspected
7G = Physical position of inspector (ergonomics)
7H = Amount of disassembly required
7I = Other external phenomenon

Cause  Description: 

7B-I  Poor  Ambient  Conditions.  Contributing  Cause  -  Low  outdoor  ambient  temperature  resulted  in  excessive  instrument  drift. 


Recommended  Corrective  Actions: 

1. Retrain inspectors on effects of unstable ambient conditions on inspection stability and the criticality of following
procedural  guidance  with  an  emphasis  on  requirements  for  hangared  inspections. 
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Worksheet  Summary 


Problem/Deficiency Category       Direct Cause    Root Cause    Contributing Cause

Operational Readiness Problem
Equipment/Material Problem
Procedure Problem
Personnel Error                   3B-I                          3B-II
Management/Field Bridge Problem
Design Problem
Training Deficiency
Management Problem                                6D, 6E        6F
External Phenomenon                                             7B

Cause  Description: 

The direct cause of the incident was that the inspector violated procedures and used an incorrect
"unshielded" probe during the inspection. Contributing causes were that a) the inspector was
pressured to rush the inspection, b) the inspector elected to violate procedures and perform
an unhangared inspection, and c) the cold ambient outdoor conditions resulted in excessive
instrument drift. Root causes were that the supervisor did not ensure sufficient manpower to
meet mission needs and that the overarching organizational structure emphasized schedule
over safety concerns.

Corrective  Actions: 

1. Seek efficiencies in other operations to free up additional time for inspection.

2. Conduct manpower review to assess availability of qualified personnel to meet mission
needs, including primary and backup personnel.

3.  Establish  and  enforce  policies  to  establish  inspection  as  a  Safety-Critical  event  that  will 
drive  schedule. 

4.  Hold  management  accountable  for  manpower  availability. 

5.  Train  personnel  on  the  role  of  inspections  to  maintain  flight  safety  and  the  critical 
nature  of  the  task. 

6.  Emphasize  a  Safety-First  culture. 

7.  Hold  inspection  personnel  accountable  to  completely  follow  inspection  tech  data 
requirements. 
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